FIG. 1. States involved in the Helstrom measurement. The nonorthogonal states to be distinguished are $|\psi_\pm\rangle$. If these states are real-valued, then they can be represented in a plane. The angle can also be taken to be acute: if the angle is obtuse, then we can rotate one of the states by $\pi$ radians without altering its physical meaning, and the new pair of states will subtend an acute angle. Any such pair of states can be written as $|\psi_\pm\rangle = \cos\theta|+\rangle \pm \sin\theta|-\rangle$, where $|\pm\rangle$ are some pair of orthogonal states. If the states $|\psi_\pm\rangle$ have equal a priori probabilities, then the optimal measurement is a von Neumann measurement in the basis $|\omega_\pm\rangle = (|+\rangle \pm |-\rangle)/\sqrt{2}$, which gives the minimum error probability.
FIG. 2. Depiction of the apparatus used in the Barnett-Riis experiment. Weak optical pulses (~0.1 photons/pulse) were prepared in a state of horizontal polarisation $|\leftrightarrow\rangle$. These then encountered a Glan-Thompson polariser. This is a polarisation filter, whose transmission axis was set to be in either the direction of $|\psi_+\rangle$ or $|\psi_-\rangle$. The pulse either emerged in one of these states, or was absorbed. The output pulses then encountered a polarising beamsplitter PBS. This was oriented at $\pi/4$ radians to the horizontal, so as to transmit the component in the direction $(|\leftrightarrow\rangle + |\updownarrow\rangle)$ to detector $D_+$, and reflect the component in the orthogonal direction $(|\leftrightarrow\rangle - |\updownarrow\rangle)$ towards the detector $D_-$. If a detection at $D_+$ is taken to indicate that the initial state was $|\psi_+\rangle$, and likewise with $D_-$ and $|\psi_-\rangle$, then this measurement gives the minimum probability of error.
FIG. 3. Results of the Barnett-Riis experiment to discriminate between two optical polarisation states, $|\psi_\pm\rangle = \cos\theta|+\rangle \pm \sin\theta|-\rangle$, with minimum error probability. The error probability is shown versus the angle $\theta$. Experimental data for specific values of $\theta$ are shown in comparison with the continuous Helstrom limit in Eq. (2.10). Clearly the agreement between theory and experiment is excellent.
FIG. 4. Diagram illustrating the general structure of a quantum operation. A quantum system is initially prepared in some state, represented by the density operator $\rho$. It is then coupled to an ancilla system. The pair of systems evolves for some time according to some unitary operator $U$, following which a von Neumann measurement may be performed on the ancilla. The final state of the original system depends upon whether or not the measurement result is recorded. If it is, and result $\omega_k$ is obtained, then the state of the original system undergoes the transformation $\rho \to \rho_k = A_k\rho A_k^\dagger/P(\omega_k|\rho)$, where the $A_k$ are linear operators satisfying $\sum_k A_k^\dagger A_k = 1$. Also, $P(\omega_k|\rho)$ is the probability of obtaining the result $\omega_k$, and is given by $\mathrm{Tr}\,\rho A_k^\dagger A_k$. If, on the other hand, the measurement result is not recorded, or no measurement is performed on the ancilla, then the final state of the system is simply the sum of all $\rho_k$ weighted by their respective probabilities, that is, $\rho \to \rho' = \sum_k P(\omega_k|\rho)\rho_k$.
FIG. 5. Illustration of states forming the 'trine' ensemble. The states $|\psi_j\rangle$, where $j = 1, 2, 3$, are linear combinations of a pair of orthogonal states $|\pm\rangle$. They have a real representation in this basis, which enables us to represent them in a plane. Each state is obtained from its predecessor by a unitary transformation, this being a counterclockwise rotation of $2\pi/3$ radians. Three consecutive such rotations are equivalent to the identity operator. As such, applying the rotation to the final state $|\psi_3\rangle$ gives the initial state $|\psi_1\rangle$.
FIG. 6. Optical measurement to unambiguously discriminate between the non-orthogonal photon polarisation states $|\psi_\pm\rangle = \cos\theta|\updownarrow\rangle \pm \sin\theta|\leftrightarrow\rangle$ at the Ivanovic-Dieks-Peres bound. The photon enters the apparatus at the bottom left, where a polarising beamsplitter PBS1 transmits the horizontal component and reflects the vertical one. The horizontal component propagates counter-clockwise to a second polarising beamsplitter PBS2. Meanwhile, the vertical component encounters an ordinary beamsplitter BS with transmission coefficient $t$ given by Eq. (3.7). If the photon is transmitted, it will register at detector $D_?$. Both states $|\psi_\pm\rangle$ have equal probability of giving rise to a detection here, so such outcomes give no information about the state. These are inconclusive results. If, on the other hand, this component is reflected, its amplitude is multiplied by the reflection coefficient $r = \sqrt{1 - t^2}$, which results in the vertical and horizontal components of both states being equal in magnitude. When the horizontal and vertical components are recombined by polarising beamsplitter PBS2, the resulting states, corresponding to whether the initial state was $|\psi_+\rangle$ or $|\psi_-\rangle$, are orthogonal. These states can thus be unambiguously discriminated by a final, suitably oriented polarising beamsplitter PBS3.

There are fundamental limits to the accuracy with which
one can determine the state of a quantum system. I give an 
overview of the main approaches to quantum state discrimina- 
tion. Several strategies exist. In quantum hypothesis testing, 
a quantum system is prepared in a member of a known, fi- 
nite set of states, and the aim is to guess which one with 
the minimum probability of error. Error-free discrimination
is also sometimes possible, if we allow for the possibility of 
obtaining inconclusive results. If no prior information about 
the state is provided, then it is impractical to try to deter- 
mine it exactly, and it must be estimated instead. In addition 
to reviewing these various strategies, I describe connections 
between state discrimination, the manipulation of quantum 
entanglement, and quantum cloning. Recent experimental 
work is also discussed. 



I. INTRODUCTION 

The information which describes the state of a physi- 
cal system is that which is required to predict its future 
evolution and its effect on other systems. Among the 
other physical systems whose dynamics it can influence 
are measuring devices. The state then contains all infor- 
mation that can be extracted by measurement and thus, 
all information that we can acquire about the system. 

The state of a classical system is described by dynami- 
cal variables. For a one dimensional point particle, these 
are its position q and momentum p. If we have com- 
plete knowledge of the values of these variables, and also 
of the equations that describe their temporal evolution, 
then we can predict, with arbitrarily high accuracy, the 
future state of the system, and how it will interact with 
other systems. 

The assumption that we can measure these variables 
with arbitrarily high accuracy is, however, an idealisa- 
tion. In addition to the fact that the results of any real 
experiment will be affected by some uncontrollable noise, 
our measuring devices have finite precision and can only 
record a finite amount of information. This implies that 
they will not be able to record exactly the values of con- 
tinuous variables which form the basis of the descriptions 
of most classical systems. Often then, we must settle for 
an approximate description of the state. In our simple 
example of a one dimensional point particle, practical 
limitations on our ability to measure q and p precisely 
might lead us to use a joint probability density g(q,p) 
instead. Often, such a description is adequate. However, 
there do exist situations where even the smallest, finite
uncertainty in our knowledge of the state variables will
be amplified over time to such large proportions that long 
term prediction of the behaviour of the system becomes 
impossible. As is well-known, this extreme sensitivity 
to initial conditions is one of the chief hallmarks of the 
phenomenon of dynamical chaos.

As a matter of principle though, there are no funda- 
mental limitations on the precision with which we can 
determine the state variables of a classical system. This 
is because the state variables are also observables. The 
amount of information that we can acquire about observ- 
ables is always increasing as more refined measurements 
become possible. This equivalence of state information 
and the information that is potentially accessible through 
measurement is a highly non-trivial feature of classical 
physics which, despite its transparent nature, should not 
be taken for granted. Classical physics is only an approx- 
imate description of our world, and currently, our most 
accurate description of it is provided by quantum the- 
ory. Here, states and observables are completely distinct 
concepts. 

The state of a quantum system is represented by a 
normalised vector in a complex, linear vector space. 
The observable properties of quantum systems, by con- 
trast, are the same as those of classical systems: position, 
momentum etc. Unlike their classical counterparts, they 
are not represented by simple numerical variables which 
evolve deterministically over time. Instead, they are represented by Hermitian operators on the vector space. We shall make no distinction in what follows between observables and their corresponding operators, and shall denote both aspects of a generic observable by $\Omega$.

Knowledge of the state vector is instrumental in making predictions about the outcome of measuring an observable $\Omega$. As is discussed in every introductory quantum mechanics text, these predictions are not generally of a deterministic nature, and are concerned instead with statistical quantities. The average, or expectation value, of $\Omega$ for an ensemble of quantum systems all prepared with the same state vector $|\psi\rangle$ is given by the inner product of $|\psi\rangle$ and $\Omega|\psi\rangle$:

$$\langle\Omega\rangle = \langle\psi|\Omega|\psi\rangle. \qquad (1.1)$$



In order to treat a single system, we express $\Omega$ in terms of its eigenvalues $\omega_k$ and orthogonal eigenstates $|\omega_k\rangle$:

$$\Omega = \sum_k \omega_k |\omega_k\rangle\langle\omega_k|. \qquad (1.2)$$



The eigenvalues $\omega_k$ are the values that the observable $\Omega$ can take. Inserting Eq. (1.2) into Eq. (1.1), we obtain

$$\langle\Omega\rangle = \sum_k \omega_k |\langle\psi|\omega_k\rangle|^2. \qquad (1.3)$$

The expectation value of any quantity is equal to the sum of the values that it can assume, multiplied by their respective probabilities. From this, we infer that the probability $P(\omega_k|\psi)$ of obtaining the result $\omega_k$ for the state $|\psi\rangle$ is

$$P(\omega_k|\psi) = |\langle\psi|\omega_k\rangle|^2. \qquad (1.4)$$

If the result $\omega_k$ is obtained, then the state of the system following the measurement will become $|\omega_k\rangle$. This is the contentious process of state vector reduction, or 'collapse of the wavefunction'. Such a measurement is known as a von Neumann measurement.
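
Equations (1.3) and (1.4) can be checked numerically. The following is a minimal sketch, assuming a two-dimensional system, a real state at an illustrative angle to the first eigenstate, and eigenvalues of +1 and -1. The function names and the numerical values are assumptions made for illustration and do not come from the text.

```python
import math

def born_probabilities(psi, basis):
    """Return P(omega_k|psi) = |<psi|omega_k>|^2 for every basis state, Eq. (1.4)."""
    probs = []
    for omega in basis:
        # Inner product <omega|psi>; its modulus equals that of <psi|omega>.
        amplitude = sum(w.conjugate() * p for w, p in zip(omega, psi))
        probs.append(abs(amplitude) ** 2)
    return probs

def expectation_value(eigenvalues, probs):
    """Probability-weighted sum of the eigenvalues, Eq. (1.3)."""
    return sum(w * p for w, p in zip(eigenvalues, probs))

# Illustrative choice: a real state at angle theta to the first eigenstate.
theta = math.pi / 6
psi = [complex(math.cos(theta)), complex(math.sin(theta))]
basis = [[1 + 0j, 0j], [0j, 1 + 0j]]  # orthonormal eigenstates |omega_k>
eigenvalues = [1.0, -1.0]             # assumed eigenvalues omega_k

probs = born_probabilities(psi, basis)
mean = expectation_value(eigenvalues, probs)
print(probs, mean)
```

For this choice the two probabilities are $\cos^2\theta$ and $\sin^2\theta$, and the expectation value is $\cos 2\theta$.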

The vector $|\psi\rangle$ is the most complete specification of the state of a quantum system. As is the case in classical physics, it is not always possible to specify the state exactly. By analogy with the probability density g(q,p), in dealing with quantum systems we will sometimes have to use a probability distribution for the state vector itself. This is described by a density operator,

$$\rho = \sum_r p_r |\psi_r\rangle\langle\psi_r|. \qquad (1.5)$$

Here, $p_r$ is the probability that the state vector of the system is $|\psi_r\rangle$, and $\sum_r p_r = 1$. When a quantum system is known to have a specific state vector, it is said to be in a pure state. Otherwise, it is in a mixed state. The generalisation of Eq. (1.4) for the probability of obtaining result $\omega_k$ when the system is in a possibly mixed state $\rho$ is

$$P(\omega_k|\rho) = \langle\omega_k|\rho|\omega_k\rangle = \mathrm{Tr}\,\rho|\omega_k\rangle\langle\omega_k|. \qquad (1.6)$$

Here, 'Tr' stands for the trace operation. The trace of an operator is, in a matrix representation, the sum of its diagonal elements. The numerical value of this sum is basis independent. So, if we choose the basis to be that formed by the eigenstates $|\omega_k\rangle$ of $\Omega$, then the trace of $\Omega$ is seen to be the sum of the corresponding eigenvalues $\omega_k$. One can quite easily show, using Eq. (1.5), that $\mathrm{Tr}\,\rho = 1$. From Eq. (1.6), it follows that the expectation value of $\Omega$ is

$$\langle\Omega\rangle = \sum_k \omega_k \langle\omega_k|\rho|\omega_k\rangle = \mathrm{Tr}\,\rho\,\Omega. \qquad (1.7)$$
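
The density operator formulas in Eqs. (1.5) to (1.7) can be illustrated in the same way. The sketch below assumes an equal mixture of two real, non-orthogonal states in two dimensions and an observable with eigenvalues +1 and -1. All names and numbers are illustrative assumptions.

```python
import math

def projector(state):
    """Matrix of |state><state| as nested lists."""
    return [[a * complex(b).conjugate() for b in state] for a in state]

def density_operator(probs, states):
    """rho = sum_r p_r |psi_r><psi_r|, Eq. (1.5)."""
    dim = len(states[0])
    rho = [[0j] * dim for _ in range(dim)]
    for p, state in zip(probs, states):
        proj = projector(state)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * proj[i][j]
    return rho

def trace_product(a, b):
    """Tr(ab), computed without forming the full matrix product."""
    dim = len(a)
    return sum(a[i][k] * b[k][i] for i in range(dim) for k in range(dim))

# Illustrative mixture: two real states at angles +theta and -theta.
theta = math.pi / 6
states = [[math.cos(theta), math.sin(theta)],
          [math.cos(theta), -math.sin(theta)]]
rho = density_operator([0.5, 0.5], states)

trace_rho = sum(rho[i][i] for i in range(2)).real            # Tr rho
p_first = trace_product(rho, projector([1.0, 0.0])).real     # Eq. (1.6)
omega = [[1.0, 0.0], [0.0, -1.0]]                            # assumed observable
mean = trace_product(rho, omega).real                        # Eq. (1.7)
print(trace_rho, p_first, mean)
```

Here the off-diagonal elements of the mixture cancel, so the density operator is diagonal with entries $\cos^2\theta$ and $\sin^2\theta$.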

The evident distinction between states and observables in quantum mechanics raises the question: to what extent can we determine the state of a quantum system? The ability to do this would confer many benefits. The most obvious of these is that we would be able to evaluate the probability distribution for the results of any measurement that we might wish to perform upon a quantum system about whose state we have no prior information. Another less obvious benefit, which we shall later see, is that it would allow two parties to communicate across arbitrarily large distances, instantaneously, in violation of the special theory of relativity.

The state itself is not an observable in quantum mechanics. As it happens, the impossibility of measuring the state has benefits of its own. As was demonstrated initially by Bennett and Brassard, the impossibility of measuring the state precisely permits the existence of provably secure protocols for the transmission of confidential information. The security of quantum cryptographic protocols, unlike that of classical ones, is a consequence of physical theory [5]. The security of classical protocols depends upon unproven assumptions about the complexity of the decoding problem. One of the most widely used cryptosystems, the RSA cryptosystem, exploits the difficulty of the problem of reducing a large number to its set of prime factors. No efficient classical algorithm for carrying out this task has been discovered. However, it has not been proven that one does not exist. In fact, it has recently been shown by Shor that an efficient algorithm does exist for quantum computers [7].

In this article, we will examine the problem of deter- 
mining the state of a quantum system. Although the 
state is not, strictly speaking, an observable, through 
a judicious choice of legitimate observables, we can ob- 
tain information about it. Several strategies for state 
discrimination exist. The one we would use in any par- 
ticular situation depends upon the type of information 
about the state we wish to obtain, and also on any prior 
information that we might possess. 

The first strategy we examine is quantum hypothesis 
testing. Here, we are given a system whose state belongs 
to a known, finite set. Our aim is to guess, with the 
minimum probability of error, which of these states the 
system is in. 

In the course of our discussion, we shall encounter the 
elegant formalism of generalised quantum measurements. 
This is based on a few necessary and sufficient conditions 
which any physically possible operation on a quantum 
system must satisfy. 

Sometimes errors can be avoided altogether if we al- 
low for the possibility of inconclusive results. We will 
see how this can be achieved. Another scenario is when 
the state does not belong to some finite, known set, and 
can be any state in the entire vector space. Under these 
circumstances, the set of possible states is infinite. Since 
any measurement can record only a finite amount of in- 
formation, it is necessary to consider only a finite, but 
suitably large set of possible states. These states are 
known as guess states. An appropriate measurement 
strategy is one which selects the guess state which the 
actual state most closely resembles, as often as possible. 
This is known as quantum state estimation. 

We shall also explore the relationship between trying to 
discriminate between quantum states and other matters, 
such as the problem of cloning quantum states, and the 
manipulation of nonlocal correlations between quantum 
systems, which can exist due to the strange phenomenon 
of quantum entanglement. 

II. QUANTUM HYPOTHESIS TESTING

A. Basic strategy 

In this article, we shall see that several approaches to 
the problem of quantum state discrimination exist. All 
of them refer to the same basic scenario. One party, con- 
ventionally called Alice, prepares a quantum system in a 
member of a set of quantum states. She might not pre- 
pare all of these states with the same probability. She 
then passes the system on to her colleague, Bob. His task
is to obtain as much information about the state which she
prepared as he possibly can. Here, we use the term 'in- 
formation' in a broad sense, and do not refer as yet to 
any specific measure. The main differences between the 
state discrimination strategies considered in this article 
correspond to the different types of information that Bob 
might wish to obtain. 

In this section, we will assume that some information 
about the state is given to Bob. He is told what the 
possible states of the system are, and he is also told the 
probability that the system was prepared in each of them. 
Here, we will consider only the situation where there are $N$ possible states, represented by density operators $\rho_j$, where $j = 1, \ldots, N$, for some finite $N$. The probability that the system was prepared in the state $\rho_j$ is denoted by $\eta_j$. These probabilities, known as the a priori probabilities, satisfy

$$\sum_{j=1}^{N} \eta_j = 1, \qquad (2.1)$$

since the system will, with certainty, be prepared in one of the states $\rho_j$.

Historically, the first strategy for state discrimination 
was that advanced by Helstrom. This strategy is
known as quantum hypothesis testing. In his attempt 
to determine the state of the system, Bob performs some 
measurement. The key feature of quantum hypothesis 
testing, as opposed to some other strategies, in particu- 
lar, the one we shall examine in the next section, is that 
after his measurement, he is required, on the basis of his 
experimental results, to make a decision as to what the 
state was. He is not allowed to say 'don't know'. We will 
see that, if the states are not orthogonal, then no test 
exists which allows him to guess correctly all of the time, 
so that there will, in general, be a non-zero probability 
of error, which we shall denote by $P_E$. Likewise, we will write the probability of correctly determining the state as $P_D = 1 - P_E$.

Since there are $N$ states, his experiment must have $N$ outcomes, which we call $\omega_k$. Following this kind of test, if Bob obtains the result $\omega_j$, he makes the hypothesis that the state given to him by Alice was $\rho_j$.

To determine the probability of error, Bob needs to know the a priori probability $\eta_j$ of being given the state $\rho_j$ and the probability, given that $\rho_j$ was sent, that he will obtain the result $\omega_k$, for all $j, k$. These conditional probabilities form the channel matrix $[P(\omega_k|\rho_j)]$. The elements of this matrix satisfy the completeness condition

$$\sum_{k=1}^{N} P(\omega_k|\rho_j) = 1. \qquad (2.2)$$

This expresses the fact that, no matter which $\rho_j$ Bob receives, his measurement will, with certainty, yield one of the outcomes $\omega_k$. The total error probability $P_E$ is found to be

$$P_E = 1 - P_D = 1 - \sum_{j=1}^{N} \eta_j P(\omega_j|\rho_j). \qquad (2.3)$$
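
As a minimal sketch of how Eq. (2.3) is evaluated, the following function takes the a priori probabilities and a channel matrix, checks the completeness condition of Eq. (2.2), and returns the error probability. The index convention `channel[k][j]` for $P(\omega_k|\rho_j)$ and the numerical values are assumptions chosen for illustration.

```python
def error_probability(priors, channel, tol=1e-9):
    """P_E = 1 - sum_j eta_j P(omega_j|rho_j), Eq. (2.3).

    channel[k][j] holds P(omega_k|rho_j), so each column must sum to 1.
    """
    n = len(priors)
    if abs(sum(priors) - 1.0) > tol:
        raise ValueError("a priori probabilities must sum to 1, Eq. (2.1)")
    for j in range(n):
        column_sum = sum(channel[k][j] for k in range(n))
        if abs(column_sum - 1.0) > tol:
            raise ValueError("channel matrix violates Eq. (2.2)")
    return 1.0 - sum(priors[j] * channel[j][j] for j in range(n))

# Illustrative two-state example with equal a priori probabilities.
priors = [0.5, 0.5]
channel = [[0.9, 0.2],
           [0.1, 0.8]]
p_error = error_probability(priors, channel)
print(p_error)
```

Only the diagonal elements of the channel matrix enter the result, since these are the probabilities of a correct identification.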

Quantum hypothesis testing actually belongs to a more general class of strategies known as quantum Bayes' strategies. The general quantum Bayes' strategy assigns a cost $C_{kj}$ to making hypothesis $\omega_k$ when the state was $\rho_j$. The coefficients $C_{kj}$ are known as the elements of the Bayes' cost matrix. The scenario can easily be understood in terms of gambling. Alice sends Bob one of the states $\rho_j$. If Bob says '$\omega_k$', then he must pay Alice $C_{kj}$ currency units. Some elements of the cost matrix can be negative, in which case Alice pays Bob, enabling him to win money. The average amount that Bob will pay Alice is then given by the Bayes' cost function

$$C_B = \sum_{jk} \eta_j C_{kj} P(\omega_k|\rho_j). \qquad (2.4)$$

For a fixed cost matrix $[C_{kj}]$ and a priori probabilities $\eta_j$, Bob's task is to minimise the overall Bayes' cost $C_B$, that is, to use a measurement which minimises his average payout to Alice. The only quantities that Bob is free to vary are the channel matrix elements $P(\omega_k|\rho_j)$. Since the possible states $\rho_j$ are fixed, this amounts to varying his measurement strategy.

The form of the cost matrix depends on the particular 
situation. In general, some errors may be more costly 
than others. If every error has equal cost, then the cor- 
responding Bayes' cost is closely related to the average 
error probability. There is no cost when the result is correct, so the diagonal elements of the cost matrix, $C_{jj}$, are zero. Let all other elements have an associated cost $c$: that is, $C_{jk} = c$ for $j \neq k$. Then one can show, using the definitions of the error probability and Bayes' cost in Eqs. (2.3) and (2.4), and the completeness condition in Eq. (2.2), that the Bayes' cost and error probability are related by $C_B = cP_E$. When all errors have the same cost $c$, minimisation of the Bayes' cost is equivalent to minimisation of the error probability.
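
The relation $C_B = cP_E$ can be verified numerically for any channel matrix that satisfies Eq. (2.2). The sketch below uses an illustrative three-state example. The helper names, the cost $c = 2$, and the channel matrix are assumptions and are not taken from the text.

```python
def error_probability(priors, channel):
    """P_E from Eq. (2.3); channel[k][j] holds P(omega_k|rho_j)."""
    return 1.0 - sum(priors[j] * channel[j][j] for j in range(len(priors)))

def bayes_cost(priors, cost, channel):
    """C_B = sum_{jk} eta_j C_kj P(omega_k|rho_j), Eq. (2.4)."""
    n = len(priors)
    return sum(priors[j] * cost[k][j] * channel[k][j]
               for j in range(n) for k in range(n))

def uniform_cost_matrix(n, c):
    """Zero cost for correct results, equal cost c for every error."""
    return [[0.0 if k == j else c for j in range(n)] for k in range(n)]

# Illustrative three-state example; every column of the channel sums to 1.
priors = [0.5, 0.3, 0.2]
channel = [[0.8, 0.2, 0.1],
           [0.1, 0.7, 0.3],
           [0.1, 0.1, 0.6]]
c = 2.0

p_error = error_probability(priors, channel)
cost_value = bayes_cost(priors, uniform_cost_matrix(3, c), channel)
print(p_error, cost_value)
```

The two quantities differ only by the factor $c$, so a measurement that minimises one also minimises the other.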

The lowest value of $P_E$ is obtained by varying the elements of the channel matrix, $P(\omega_k|\rho_j)$. As a consequence of the non-trivial nature of the measurement process in quantum mechanics, the form of this matrix cannot be specified arbitrarily. Consider for example a von Neumann measurement of an observable $\Omega$. Let us assume that the states $\rho_j$ are pure states $|\psi_j\rangle\langle\psi_j|$. We would like to associate each outcome with a unique state, so that if Bob obtains the result $\omega_j$ he will make the hypothesis that the state Alice sent him was $|\psi_j\rangle$. Clearly, the number $N$ of possible states must be equal to the number of outcomes. Consequently, we must also then assume that the $|\psi_j\rangle$ are linearly independent. Otherwise, there would be more states than outcomes.

In the Introduction, we saw that each outcome of a von Neumann measurement corresponds to a different eigenvalue of a Hermitian operator $\Omega$. The eigenvalues $\omega_j$ are the possible numerical values of $\Omega$, considered as an observable property of the system. From Eq. (1.4), we see that the channel matrix elements are given by the square-overlaps between the $|\psi_j\rangle$ and the eigenstates of $\Omega$:

$$P(\omega_k|\psi_j) = |\langle\psi_j|\omega_k\rangle|^2. \qquad (2.5)$$

The diagonal elements of this matrix must all be equal to 1 if the error probability is to vanish. This clearly gives the requirement that $|\psi_j\rangle = |\omega_j\rangle$, which cannot be the case if the $|\psi_j\rangle$ are non-orthogonal.

A simple von Neumann measurement of this kind, how- 
ever, is often the most useful for the kind of strategy 
we are considering. In fact, it was proven in 1973 by 
Kennedy [10] that if we are attempting to distinguish between $N$ pure states which are linearly independent, as we have been assuming, then there is always a von Neumann measurement which is optimal, in the sense that it can be used to obtain the smallest possible value of $P_E$.
It follows that only orthogonal states can be perfectly 
discriminated. 



B. Hypothesis testing for two pure states 

The simplest set of linearly independent pure states, and historically the first set for which an explicit expression for the minimum error probability was obtained, is that of just two states. The problem of finding the minimum value of $P_E$ for two pure states, which we shall denote simply by $|\psi_\pm\rangle$, was solved by Helstrom and can be considered to be a pioneering work in quantum detection theory. Helstrom's optimal value of $P_E$ is

$$P_E(\mathrm{opt}) = \frac{1}{2}\left(1 - \sqrt{1 - 4\eta_+\eta_-|\langle\psi_+|\psi_-\rangle|^2}\right). \qquad (2.6)$$

Naturally, we would like to determine the von Neumann measurement which can be used to attain this limit. The corresponding basis states $|\omega_\pm\rangle$ have quite a simple form if we take the states $|\psi_\pm\rangle$ to be

$$|\psi_\pm\rangle = \cos\theta|+\rangle \pm \sin\theta|-\rangle \qquad (2.7)$$



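for some angle $0 \le \theta \le \pi/4$, and where $|\pm\rangle$ is an orthogonal basis for the space spanned by $|\psi_\pm\rangle$. For the states $|\psi_\pm\rangle$ in Eq. (2.7), the optimum detector states $|\omega_\pm\rangle$ are

$$|\omega_\pm\rangle = \frac{1}{\sqrt{2}}\left(\sqrt{1 \pm \xi}\,|+\rangle \pm \sqrt{1 \mp \xi}\,|-\rangle\right). \qquad (2.8)$$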



Here, $\xi = \Delta\cos 2\theta/\sqrt{1 + \cos^2 2\theta\,(\Delta^2 - 1)}$, where $\Delta = \eta_+ - \eta_-$. Alternative expressions for the optimal detector states can be found in the literature. The optimum detection strategy for a pair of mixed quantum states has also been obtained.

For two pure states with equal a priori probabilities $\eta_+ = \eta_- = 1/2$, the optimum detector states in Eq. (2.8) have a much simpler form, and the optimum measurement has appealing geometrical properties. When the a priori probabilities are equal, we have $\Delta = 0$, which in turn implies that $\xi = 0$. The states $|\omega_\pm\rangle$ are then given simply by

$$|\omega_\pm\rangle = \frac{|+\rangle \pm |-\rangle}{\sqrt{2}}, \qquad (2.9)$$


and the minimum error probability simplifies to $P_E(\mathrm{opt}) = [1 - (1 - |\langle\psi_+|\psi_-\rangle|^2)^{1/2}]/2$. All four states in Eqs. (2.7) and (2.9) are depicted in Figure 1. The symmetrical properties of the measurement states $|\omega_\pm\rangle$ with respect to the possible states $|\psi_\pm\rangle$ are clearly visible in the figure. The $|\omega_\pm\rangle$ are as close as they can be to the $|\psi_\pm\rangle$ whilst maintaining orthogonality. The reflection symmetry about the $|+\rangle$-axis is due to the equality of the a priori probabilities $\eta_\pm$. We can also see from the figure that errors are unavoidable, since $|\omega_\pm\rangle$ is not orthogonal to $|\psi_\mp\rangle$.
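
The Helstrom bound of Eq. (2.6) and the detector states of Eq. (2.8) can be cross-checked against each other. The sketch below assumes the real parametrisation of Eq. (2.7), for which the overlap of the two states is $\cos 2\theta$, and compares the bound with the error probability obtained by measuring in the detector basis. The function names and the numerical values of $\theta$ and $\eta_+$ are illustrative assumptions.

```python
import math

def helstrom_bound(eta_plus, theta):
    """Minimum error probability of Eq. (2.6) for the states of Eq. (2.7)."""
    eta_minus = 1.0 - eta_plus
    overlap = math.cos(2.0 * theta)  # <psi_+|psi_-> = cos^2(theta) - sin^2(theta)
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * eta_plus * eta_minus * overlap ** 2))

def detector_states(eta_plus, theta):
    """Components of |omega_+> and |omega_-> in the |+>, |-> basis, Eq. (2.8)."""
    delta = eta_plus - (1.0 - eta_plus)
    c2 = math.cos(2.0 * theta)
    # The denominator vanishes only if theta = 0 and delta = 0 (identical states).
    xi = delta * c2 / math.sqrt(1.0 + c2 ** 2 * (delta ** 2 - 1.0))
    omega_plus = (math.sqrt((1.0 + xi) / 2.0), math.sqrt((1.0 - xi) / 2.0))
    omega_minus = (math.sqrt((1.0 - xi) / 2.0), -math.sqrt((1.0 + xi) / 2.0))
    return omega_plus, omega_minus

def measured_error(eta_plus, theta):
    """Error probability, Eq. (2.3), for a measurement in the basis of Eq. (2.8)."""
    psi_plus = (math.cos(theta), math.sin(theta))
    psi_minus = (math.cos(theta), -math.sin(theta))
    omega_plus, omega_minus = detector_states(eta_plus, theta)
    p_plus = (omega_plus[0] * psi_plus[0] + omega_plus[1] * psi_plus[1]) ** 2
    p_minus = (omega_minus[0] * psi_minus[0] + omega_minus[1] * psi_minus[1]) ** 2
    return 1.0 - (eta_plus * p_plus + (1.0 - eta_plus) * p_minus)

theta, eta_plus = math.pi / 8, 0.7  # illustrative values
bound = helstrom_bound(eta_plus, theta)
attained = measured_error(eta_plus, theta)
print(bound, attained)
```

For equal a priori probabilities the same functions reduce to the measurement in the basis of Eq. (2.9).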

The Helstrom measurement has recently been carried out in the laboratory by Barnett and Riis [13]. In this experiment, the two states were non-orthogonal photon polarisation states, having the form shown in Eq. (2.7), where the orthogonal states $|+\rangle$ and $|-\rangle$ were the horizontal $|\leftrightarrow\rangle$ and vertical $|\updownarrow\rangle$ polarisation states respectively. The experimental arrangement used is shown in Figure 2. Pulses of light emerged from the left in the horizontally polarised state $|\leftrightarrow\rangle$. These were then heavily attenuated to the point where, on average, only 1 in 10 pulses contained a photon. This was done to make the probability of there being 2 or more photons per pulse negligible. A Glan-Thompson polariser GTP was then used to transform the photons into one of the states $|\psi_\pm\rangle$ in Eq. (2.7). The beam was then analysed at a polarising beamsplitter PBS oriented at an angle of $\pi/4$ to the horizontal. To understand the action of this beamsplitter, we refer to the states $|\omega_\pm\rangle$ in Eq. (2.9), and again make the identifications $|+\rangle = |\leftrightarrow\rangle$ and $|-\rangle = |\updownarrow\rangle$. A photon in the state $|\omega_+\rangle$ would be transmitted by the beamsplitter, while one in the state $|\omega_-\rangle$ would be reflected. The transmitted and reflected states were fed to photodetectors $D_+$ and $D_-$ respectively. Correct results were obtained when a photon prepared in the state $|\psi_j\rangle$ was detected at $D_j$, where $j = \pm$. If the photon was detected at the other, 'wrong' detector, an error ensued.

In the Barnett-Riis experiment, both states had equal a priori probabilities. The minimum error probability is then

$$P_E(\mathrm{opt}) = \frac{1}{2}(1 - \sin 2\theta). \qquad (2.10)$$

Experimental results for the error probability for various values of $\theta$ are shown alongside the theoretical minimum in Figure 3.
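
The statistics of an idealised version of this measurement can be simulated directly. The sketch below is not a model of the actual apparatus: it assumes perfect state preparation, lossless detection, and single photons, and it simply samples detector clicks from the Born rule for the basis of Eq. (2.9). The function names, the seed, and the number of trials are illustrative assumptions.

```python
import math
import random

def simulate_error_rate(theta, trials, seed=0):
    """Monte Carlo estimate of the error rate for equal a priori probabilities."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        # Prepare |psi_+> or |psi_-> with probability 1/2 each.
        sign = 1 if rng.random() < 0.5 else -1
        # Probability of a click at D_+: |<omega_+|psi_sign>|^2 with
        # |omega_+> = (|+> + |->)/sqrt(2).
        p_plus = (math.cos(theta) + sign * math.sin(theta)) ** 2 / 2.0
        outcome = 1 if rng.random() < p_plus else -1
        if outcome != sign:
            errors += 1
    return errors / trials

def helstrom_equal_priors(theta):
    """Helstrom limit of Eq. (2.10)."""
    return 0.5 * (1.0 - math.sin(2.0 * theta))

theta = math.pi / 8  # illustrative angle
estimate = simulate_error_rate(theta, trials=20000)
limit = helstrom_equal_priors(theta)
print(estimate, limit)
```

With this many trials the estimate should agree with Eq. (2.10) to within a few times the statistical uncertainty, which is of order 0.003 here.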

Although the problem of finding the minimum error 
probability P_E(opt) for two states has been solved com- 
pletely, it is generally difficult to find analytic expressions 
for more than two states. Standard von Neumann mea- 
surements of the kind which are optimal for two pure 
states, or indeed, any number of linearly independent 
states, cannot detect all of the states if they form a lin- 
early dependent set. If the state space of the system is N 
dimensional, then a von Neumann measurement can have 
at most N outcomes. If there are more than N states, 
then some of these cannot be detected by the measure- 
ment. 

Fortunately, the formalism of quantum mechanics does not restrict us to state transformations described by von Neumann measurements. This much is obvious from the transformation generated by the free evolution of a quantum system. This is described by the Schrödinger equation, and is quite unlike what happens during a measurement. In order to decide whether or not a given operation on the quantum state is physically realisable, it would be helpful to know what the general criteria are. These are firmly established, and form the basis for the elegant formalism of generalised quantum measurements [14], which we now describe.



C. Generalised measurements 

Consider a quantum system initially prepared in the state $\rho$. An operation $\rho \to L(\rho)$ is carried out on the system. This operation has $K$ distinguishable outcomes which, as before, we label $\omega_k$, $k = 1, \ldots, K$, with corresponding final density operators $\rho'_k$. In a von Neumann measurement, the probability of outcome $\omega_k$ is given by Eq. (1.6), that is, the trace of the product of the initial density operator and $|\omega_k\rangle\langle\omega_k|$. In the more general kind of measurement we describe here, the latter operators are replaced by more general operators, known as quantum detection operators, $\Pi_k$. By analogy with Eq. (1.6), the probability of obtaining result $\omega_k$, given the initial state $\rho$, is

$$P(\omega_k|\rho) = \mathrm{Tr}\,\rho\,\Pi_k. \qquad (2.11)$$

If $\rho$ is a pure state $|\psi\rangle\langle\psi|$, then this probability $P(\omega_k|\psi)$ is simply

$$P(\omega_k|\psi) = \langle\psi|\Pi_k|\psi\rangle. \qquad (2.12)$$

Naturally, $P(\omega_k|\psi)$ is always real. This implies that the quantum detection operators must be Hermitian. This probability must also be non-negative for all states. Thus, the expectation value of $\Pi_k$ must always be non-negative. Operators whose expectation values are non-negative for all possible states are said to be positive (semi-definite). They may be equivalently defined as operators whose eigenvalues are non-negative. One further constraint on the form of these operators comes from the requirement that the possible outcomes $\omega_k$ are exhaustive, which implies that $\sum_k P(\omega_k|\rho) = 1$ for all possible states. From this, it follows that the $\Pi_k$ form a resolution of the identity,

$$\sum_k \Pi_k = 1. \qquad (2.13)$$


The conditions we have just given are the necessary and sufficient conditions for the realisability of an experiment whose outcomes have the probability distribution $P(\omega_k|\rho)$. Such an operation is also commonly known as a positive operator-valued measure operation, or POVM, and the detection operators are called the elements of the POVM.
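
The three conditions (Hermiticity, positivity, and resolution of the identity) are straightforward to test for a given set of operators. The sketch below is restricted, as an assumption, to real 2x2 matrices, for which positivity is equivalent to a non-negative trace and determinant. As an illustrative example it uses three detection operators proportional to projectors onto real states separated by $2\pi/3$, the arrangement shown in Figure 5. The weight 2/3 is chosen so that the operators sum to the identity. All function names are hypothetical.

```python
import math

def weighted_projector(state, weight):
    """weight * |state><state| for a real two-component state."""
    return [[weight * a * b for b in state] for a in state]

def is_povm(elements, tol=1e-12):
    """Check Hermiticity, positivity and completeness for real 2x2 matrices."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for el in elements:
        # Hermitian: a real matrix must be symmetric.
        if abs(el[0][1] - el[1][0]) > tol:
            return False
        # Positive semi-definite: for a 2x2 Hermitian matrix this is
        # equivalent to non-negative trace and determinant.
        tr = el[0][0] + el[1][1]
        det = el[0][0] * el[1][1] - el[0][1] * el[1][0]
        if tr < -tol or det < -tol:
            return False
        for i in range(2):
            for j in range(2):
                total[i][j] += el[i][j]
    # Completeness: the elements must sum to the identity, Eq. (2.13).
    identity = [[1.0, 0.0], [0.0, 1.0]]
    return all(abs(total[i][j] - identity[i][j]) <= tol
               for i in range(2) for j in range(2))

angles = [2.0 * math.pi * j / 3.0 for j in range(3)]
trine_states = [[math.cos(a), math.sin(a)] for a in angles]
trine_povm = [weighted_projector(s, 2.0 / 3.0) for s in trine_states]
print(is_povm(trine_povm), is_povm(trine_povm[:2]))
```

This example has three outcomes in a two-dimensional space, which a von Neumann measurement cannot provide.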

Often, and particularly in state discrimination, we are only interested in these probabilities, and not overly concerned about how the state of the system is transformed by the measurement. However, this is not always the case; indeed, we shall be concerned about this issue in Section IV. It is then useful to know what form this state transformation must take. To this end, consider the operator

$$A_k = U_k \Pi_k^{1/2}, \qquad (2.14)$$

where $U_k$ is any unitary operator. From this expression, and from the fact that $U_k^\dagger U_k = 1$, we can see that $\Pi_k = A_k^\dagger A_k$, and that the detection probability $P(\omega_k|\rho)$ can be alternatively expressed as $\mathrm{Tr}\,A_k^\dagger A_k\rho$. It can also be expressed as $\mathrm{Tr}\,A_k\rho A_k^\dagger$, since the trace of a product of operators is invariant under cyclic permutations. The post-measurement density operator, given that result $\omega_k$ was obtained, is

$$\rho'_k = \frac{A_k\rho A_k^\dagger}{P(\omega_k|\rho)}. \qquad (2.15)$$


The presence of the probability in the denominator serves to give $\mathrm{Tr}\,\rho'_k = 1$, normalising the state. If we do not actually record the result of the measurement, then the final density operator, which may simply be denoted by $\rho'$, is given by a distribution of the density operators $\rho'_k$ corresponding to the possible outcomes of the operation, weighted by their respective probabilities $P(\omega_k|\rho)$. That is,

$$\rho' = \sum_k P(\omega_k|\rho)\,\rho'_k = \sum_k A_k\rho A_k^\dagger. \qquad (2.16)$$
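
The state transformations of Eqs. (2.15) and (2.16) can be sketched as follows. The example assumes the simplest choice $U_k = 1$ in Eq. (2.14), so that each $A_k$ is the square root of a detection operator, and it uses three detection operators of the form $(2/3)|\psi_k\rangle\langle\psi_k|$ for real states separated by $2\pi/3$. These operators, the initial state, and the function names are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Hermitian conjugate of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def apply_measurement(rho, kraus_ops):
    """Return [(P(omega_k|rho), rho'_k), ...] and the unrecorded state rho'."""
    dim = len(rho)
    unrecorded = [[0j] * dim for _ in range(dim)]
    outcomes = []
    for a_k in kraus_ops:
        unnormalised = matmul(matmul(a_k, rho), dagger(a_k))  # A_k rho A_k^dagger
        prob = trace(unnormalised).real                       # Tr A_k rho A_k^dagger
        for i in range(dim):
            for j in range(dim):
                unrecorded[i][j] += unnormalised[i][j]        # Eq. (2.16)
        # Eq. (2.15); undefined for outcomes that never occur.
        post = ([[x / prob for x in row] for row in unnormalised]
                if prob > 1e-15 else None)
        outcomes.append((prob, post))
    return outcomes, unrecorded

# A_k = sqrt(2/3) |psi_k><psi_k| is the square root of (2/3)|psi_k><psi_k|.
angles = [2.0 * math.pi * k / 3.0 for k in range(3)]
states = [[math.cos(a), math.sin(a)] for a in angles]
kraus_ops = [[[math.sqrt(2.0 / 3.0) * a * b for b in s] for a in s] for s in states]

rho = [[1.0, 0.0], [0.0, 0.0]]  # illustrative initial pure state
outcomes, unrecorded = apply_measurement(rho, kraus_ops)
probs = [p for p, _ in outcomes]
print(probs, trace(unrecorded).real)
```

In this example each post-measurement state is the pure state $|\psi_k\rangle\langle\psi_k|$, whatever the initial state, and the unrecorded state has unit trace.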

The formalism we have outlined appears to be more general than the description of quantum state changes given in introductory quantum mechanics texts. There, usually only unitary operations and von Neumann measurements are discussed. It is easy to see that these operations are special cases of the generalised measurements we have just described. A von Neumann measurement of an operator $\Omega$ with the orthogonal eigenstates $|\omega_k\rangle$ can be expressed in terms of the operators $P_k = |\omega_k\rangle\langle\omega_k|$. These are projection operators, and are clearly Hermitian and positive, having eigenvalues 0 and 1. They also form a resolution of the identity,

$$\sum_k P_k = \sum_k |\omega_k\rangle\langle\omega_k| = 1, \qquad (2.17)$$


which expresses the completeness of the orthogonal basis 
|cjfc). They are also idempotent, that is, P| = Pk- It 
follows that the projectors Pk satisfy the properties re- 
quired for them to be physical transformation operators 
Ak- Applying Eq. (2.15), we see that the possible post- 
measurement states are just the eigenstates |wfc), in ac- 
cordance with the idea of collapse of the wavefunction. If 
the measurement result is not recorded, substituting the 
Pk for Ak in Eq. (2.16) just gives a statistical mixture of 
the eigenstates, weighted by their respective probabilities 
P("h\p). 

At the other extreme, if only one $A_k$, which we may just call $A$,
is non-zero, then the resolution of the identity in Eq. (2.13) implies
that $A^\dagger A = 1$, i.e. that $A$ is unitary. Equations (2.15) and
(2.16) are both equivalent here, since there is only one 'outcome',
which represents unitary evolution according to the von Neumann
equation

$$i\hbar\frac{d\rho}{dt} = [H, \rho], \tag{2.18}$$

where $[H, \rho]$ is the commutator $H\rho - \rho H$. This is simply
the generalisation of the Schrödinger equation to cover mixed states.
The solution is $\rho(t) = U\rho(0)U^\dagger$, where
$U = e^{-iHt/\hbar}$. Any unitary operator $U$ can be written in this
form for some Hamiltonian $H$. So, if we are sufficiently able to
tailor the Hamiltonian $H$ of our system, then we can generate any
unitary evolution.

Despite its appearance, the generalised measurement formalism is not
really more general than these two more familiar types of state
transformation. An important result, known as the Naimark theorem
[11], tells us that any generalised measurement can be realised with
an ancillary system, a unitary operation and a von Neumann
measurement. Specifically, if we wish to realise a generalised
measurement with $K$ outcomes, we need a large ancillary system. The
system of interest is then made to interact unitarily with the
ancilla. In general, this results in the original system and the
ancilla becoming entangled. Entanglement is a feature of quantum
mechanics we shall examine at greater length in section IV. Following
this interaction, a von Neumann measurement is performed on the
ancilla. As a consequence of this entanglement, this measurement also
transforms the state of our original system, and the results of this
measurement give rise to the corresponding transformations in Eq.
(2.15). The effect of a generalised measurement, implemented by a
unitary interaction with an ancilla, followed by a measurement on the
latter, is illustrated in Figure 5.

D. Hypothesis testing for multiple states 

Returning now to the problem of state discrimination, Bob has in his
possession a quantum system prepared in one of the $N$ states
$\rho_j$, with a priori probabilities $\eta_j$. His aim is to
determine the strategy with $N$ outcomes whose detection operators
$\Pi_j$ give the minimum value of the error probability. If outcome
$\omega_j$ is taken to correspond to detection of the state $\rho_j$,
then the error probability follows from the fact that the probability
of correctly identifying the state $\rho_j$ will be
$\mathrm{Tr}\,\rho_j\Pi_j$. The sum of these probabilities for the $N$
states $\rho_j$, weighted by their a priori probabilities $\eta_j$,
gives the total probability $P_d$ that the state will be correctly
identified. The error probability $P_e$ is equal to $1 - P_d$, giving

$$P_e = 1 - \sum_j \eta_j\,\mathrm{Tr}\,\rho_j\Pi_j. \tag{2.19}$$

Holevo [16] and Yuen et al independently determined the necessary and
sufficient conditions that a set of detection operators must satisfy
to give the minimum value of $P_e$. These are

$$\Pi_j\,[\eta_j\rho_j - \eta_k\rho_k]\,\Pi_k = 0, \tag{2.20}$$

$$\Gamma - \eta_j\rho_j \geq 0, \tag{2.21}$$

where we have defined an operator $\Gamma$ known as the Lagrange
operator

$$\Gamma = \sum_k \eta_k\Pi_k\rho_k, \tag{2.22}$$

which, as a consequence of the condition in Eq. (2.20), is Hermitian
(this can be seen by summing Eq. (2.20) over both $j$ and $k$, and
making use of the resolution of identity in Eq. (2.13)). One
important kind of ensemble of states for which the optimum strategy
can be derived analytically is that of pure states with equal a
priori probabilities, $\eta_j = 1/N$, which are also symmetric
|9|Jlq|. A set of states is symmetric if it satisfies the following
conditions:

$$|\psi_j\rangle = U|\psi_{j-1}\rangle = U^{j-1}|\psi_1\rangle, \tag{2.23}$$

$$U|\psi_N\rangle = |\psi_1\rangle, \tag{2.24}$$

for some unitary operator $U$. We see that $U$ transforms each state
into its successor, and the final state back to the initial state.
The optimum measurement for these states is the so-called square-root
measurement. We define the operator

$$\Phi = \sum_{j=1}^{N} |\psi_j\rangle\langle\psi_j|. \tag{2.25}$$

The optimum detection operators $\Pi_j$ are of the form

$$\Pi_j = |\mu_j\rangle\langle\mu_j|, \tag{2.26}$$

where the, in general unnormalised and non-orthogonal, states
$|\mu_j\rangle$ are given by

$$|\mu_j\rangle = \Phi^{-1/2}|\psi_j\rangle. \tag{2.27}$$

It is because of the presence of $\Phi^{-1/2}$ on the right hand side
of this expression that this measurement is known as the square-root
measurement. For equally-probable symmetric states, this measurement
attains the minimum error probability

$$P_e(\mathrm{min}) = 1 - \frac{1}{N}\sum_{j=1}^{N}\left|\langle\psi_j|\Phi^{-1/2}|\psi_j\rangle\right|^2. \tag{2.28}$$
The simplest set of symmetric states is that of just two states,
which we examined above. Looking back at Eq. (2.7), we see that if
the orthogonal basis states $|\pm\rangle$ are to be regarded as the
spin up/down states of a spin-1/2 particle with respect to the
z-axis, then the z component of the spin vector, $\sigma_z$, acts on
these states to give $\sigma_z|\pm\rangle = \pm|\pm\rangle$. Using
this property, it can easily be seen that
$\sigma_z|\psi_\pm\rangle = |\psi_\mp\rangle$. This operator is
unitary and satisfies $\sigma_z^2 = 1$, so these states satisfy the
symmetric states conditions in Eqs. (2.23-2.24).

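The next simplest case is that of three states. Three symmetric
photon polarisation states are

$$|\psi_1\rangle = |\updownarrow\rangle, \tag{2.29}$$

$$|\psi_2\rangle = -\tfrac{1}{2}\left(|\updownarrow\rangle - \sqrt{3}\,|\leftrightarrow\rangle\right), \tag{2.30}$$

$$|\psi_3\rangle = -\tfrac{1}{2}\left(|\updownarrow\rangle + \sqrt{3}\,|\leftrightarrow\rangle\right). \tag{2.31}$$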



These states are illustrated schematically in Figure 5, where we see
that they are distributed around a circle, with equal angular spacing
of $2\pi/3$ radians. This ensemble of states is sometimes called the
trine ensemble. If these states have equal a priori probabilities,
then the minimum error probability is equal to 1/3, and the optimum
strategy is given by the detection operators

$$\Pi_j = \frac{2}{3}|\psi_j\rangle\langle\psi_j|. \tag{2.32}$$
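The trine results quoted above can be checked with a short numerical sketch. The coordinates used for the three states are simply one convenient choice of real unit vectors separated by $2\pi/3$ (an assumption of this example), and the helper names are hypothetical. The fragment builds the sum of the three projectors, finds that it is proportional to the identity, forms the rescaled projectors as detection operators, and evaluates the error probability for equal a priori probabilities.

```python
import math

# Illustrative sketch: one possible choice of three real unit vectors
# separated by 2*pi/3 (an assumption of this example).
trine = [(math.cos(2 * math.pi * j / 3), math.sin(2 * math.pi * j / 3))
         for j in range(3)]

def outer(u, v):
    """|u><v| for real vectors."""
    return [[a * b for b in v] for a in u]

def expectation(m, v):
    """<v|m|v> for a real vector v."""
    return sum(v[i] * m[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v)))

# Phi = sum_j |psi_j><psi_j|; for the trine this is (3/2) times the identity.
phi = [[sum(outer(p, p)[i][j] for p in trine) for j in range(2)]
       for i in range(2)]

# Because Phi is proportional to the identity, Phi^(-1/2) is just a number
# and the square-root measurement reduces to rescaled projectors.
povm = [[[x / phi[0][0] for x in row] for row in outer(p, p)] for p in trine]

completeness = [[sum(pi_j[i][j] for pi_j in povm) for j in range(2)]
                for i in range(2)]

# P_e = 1 - sum_j eta_j <psi_j|Pi_j|psi_j> with eta_j = 1/3
p_error = 1 - sum(expectation(pi_j, p) / 3 for pi_j, p in zip(povm, trine))
```

Here `completeness` comes out as the identity matrix and `p_error` as 1/3, in agreement with the value stated in the text.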



The first method of carrying out the optimum state discrimination
measurement for a trine ensemble of photon polarisation states was
proposed by Sasaki et al [19]. Phoenix et al [20] explored the
potential of this ensemble of states in quantum cryptography, showing
that, for a certain, novel three state key distribution protocol, it
is the trine ensemble which can be used to generate secret key bits
most efficiently.

The square-root measurement for three symmetrical photon polarisation
states can be performed with current technology. Indeed, at the time
of writing, experimental demonstrations of both this measurement, and
also the more complex minimum error probability discrimination of
four non-coplanar states arranged as a tetrahedron, have just been
carried out by Clarke et al [21].



III. UNAMBIGUOUS STATE DISCRIMINATION 

A. Error-free discrimination between two 
non-orthogonal states. 

As we have seen, the formalism of generalised measure- 
ments offers greater scope for the possibility of discrimi- 
nating between non-orthogonal quantum states than sim- 
ple von Neumann measurements. One of the main ad- 
vantages conferred by generalised measurements is the 
fact that the number of distinguishable outcomes can 
be arbitrarily large. The number of outcomes possible 
with a von Neumann measurement is restricted to be no 
greater than the number of dimensions of the system's 
state space. This means that quantum hypothesis test- 
ing, with a full set of outcomes corresponding to each of 
the states, is not generally possible with von Neumann 
measurements, in particular, if the states are linearly de- 
pendent. As we saw in the case of the trine ensemble, it is 
necessary under such circumstances to use a generalised 
measurement. 

Being able to perform measurements with an arbitrary number of
outcomes, we might ask ourselves if there is anything to be gained if
not every outcome need correspond to the detection of one of the
states. In other words, do we gain anything if we allow our
measurement to have inconclusive results? This issue was first
examined in 1987 by Ivanovic [22], who made the startling discovery
that the possibility of occasionally obtaining inconclusive results
permits error free discrimination between non-orthogonal states.
Ivanovic showed that when the result of this measurement is not
inconclusive, it is always correct.

To see how this may be done, consider again the two states
$|\psi_\pm\rangle$ defined in Eq. (2.7). Let us now introduce the
additional states

$$|\psi_\pm^\perp\rangle = \sin\theta|+\rangle \pm \cos\theta|-\rangle. \tag{3.1}$$

Notice that $|\psi_+^\perp\rangle$ is orthogonal to
$|\psi_-\rangle$, and likewise with $|\psi_-^\perp\rangle$ and
$|\psi_+\rangle$. Consider now a generalised measurement described by
the following detection operators:

$$\Pi_\pm = \frac{P_\pm}{|\langle\psi_\pm^\perp|\psi_\pm\rangle|^2}\,|\psi_\pm^\perp\rangle\langle\psi_\pm^\perp|, \qquad \Pi_? = 1 - \Pi_+ - \Pi_-. \tag{3.2}$$

The meaning of the coefficients $P_\pm$ will become apparent shortly.
Since $|\psi_+^\perp\rangle$ is orthogonal to $|\psi_-\rangle$, we
see that $\langle\psi_-|\Pi_+|\psi_-\rangle = 0$, and so the
probability of obtaining the result '+' for this state must be zero.
Likewise, $\langle\psi_+|\Pi_-|\psi_+\rangle = 0$, so we will never
obtain the result '-' for the initial state $|\psi_+\rangle$. Thus,
whenever we obtain one of these two results, we can retrodict exactly
what the initial state was. We can easily see that

$$\langle\psi_\pm|\Pi_\pm|\psi_\pm\rangle = P_\pm. \tag{3.3}$$

This means that $P_\pm$ is the probability, given that the system was
prepared in the state $|\psi_\pm\rangle$, that this state will be
identified unambiguously.

Unless the states are orthogonal, these probabilities cannot attain
the value of 1. There is a third result, the inconclusive result '?',
the probability of which for each state is equal to the expectation
value of the operator $\Pi_?$. As with the quantum hypothesis testing
strategy we discussed in the preceding section, it is important to
optimise this measurement. This means obtaining the maximum
unambiguous discrimination probability, or equivalently, the minimum
probability of inconclusive results. To obtain this, we must know the
a priori probabilities $\eta_\pm$ of the two states. Given these, the
total probability $P_?$ of obtaining an inconclusive result is

$$P_? = 1 - \sum_{j=+,-} \eta_j P_j. \tag{3.4}$$

The variational problem whose solution is $P_?(\mathrm{opt})$
essentially consists of determining the values of $P_\pm$ which
minimise Eq. (3.4) subject to the constraint that the operator
$\Pi_?$ is positive. For two states with equal a priori probabilities
$\eta_\pm = 1/2$, it was established through the work of Ivanovic
[22], Dieks [23] and Peres [24] that the minimum attainable value of
the inconclusive result probability is given by

$$P_?(\mathrm{opt}) = |\langle\psi_+|\psi_-\rangle|. \tag{3.5}$$

This Ivanovic-Dieks-Peres (IDP) limit is obtained when $P_+$ and
$P_-$ are both equal to $1 - |\langle\psi_+|\psi_-\rangle|$, which is
also the total probability of obtaining a conclusive, correct result.
A more general bound was later obtained by Jaeger and Shimony [25],
which solves the problem for unequal a priori probabilities. A
particularly illuminating discussion of the Jaeger-Shimony result in
the context of quantum communications was given by Ban [26].
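The construction described above can be verified numerically for a specific angle. The sketch below assumes real two-component vectors for the states, the example value $\theta = \pi/6$, and hypothetical helper names. It builds the two conclusive detection operators from the auxiliary orthogonal states, with both success probabilities set to one minus the overlap, and then checks that wrong identifications never occur and that the inconclusive probability equals the overlap.

```python
import math

theta = math.pi / 6          # assumed example angle, 0 < theta <= pi/4

psi = {+1: (math.cos(theta),  math.sin(theta)),
       -1: (math.cos(theta), -math.sin(theta))}
# Auxiliary states: perp[+1] is orthogonal to psi[-1], and vice versa.
perp = {+1: (math.sin(theta),  math.cos(theta)),
        -1: (math.sin(theta), -math.cos(theta))}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scaled_projector(v, weight):
    """weight * |v><v| for a real vector v."""
    return [[weight * a * b for b in v] for a in v]

def expectation(m, v):
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

overlap = abs(dot(psi[+1], psi[-1]))      # equals cos(2 * theta)
p_success = 1 - overlap                   # IDP choice, P_+ = P_-

detection = {s: scaled_projector(perp[s],
                                 p_success / dot(perp[s], psi[s]) ** 2)
             for s in (+1, -1)}
pi_inconclusive = [[(1.0 if i == j else 0.0)
                    - detection[+1][i][j] - detection[-1][i][j]
                    for j in range(2)] for i in range(2)]

wrong_plus = expectation(detection[+1], psi[-1])   # '+' given |psi_->
right_plus = expectation(detection[+1], psi[+1])   # '+' given |psi_+>
p_inconclusive = 0.5 * sum(expectation(pi_inconclusive, psi[s])
                           for s in (+1, -1))
```

In this example the inconclusive operator turns out to be diagonal with one vanishing eigenvalue, so it sits exactly on the boundary of positivity; raising the success probabilities any further would make it non-positive.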

It is important to understand what happens to the state of the system
when an inconclusive result is obtained. It might be tempting to
imagine that this outcome is of little importance, since we could
repeat the measurement. Unfortunately, this is not the case. If an
inconclusive result is obtained, then the states $|\psi_\pm\rangle$
undergo a transformation. In general, for a fixed pair of states, the
lower $P_?$ is for the measurement, the closer to each other, in
terms of their overlap, the possible states will be after the
transformation. As $P_?$ reaches $P_?(\mathrm{opt})$ in Eq. (3.5),
both states are transformed into the same state, rendering any
further attempt to discriminate between them futile. Unambiguous
discrimination can then be regarded as a kind of gambling operation.
The states $|\psi_\pm\rangle$ are distinguishable to some extent,
though not completely so. We can gamble this partial
distinguishability in the hope of obtaining complete
distinguishability, and will succeed with probability $1 - P_?$. If
we lose, however, the states become less distinguishable than they
were initially.

As with the measurements described in the preceding section, photon
polarisation states are ideally suited to the experimental
realisation of this measurement [27]. The two states can be
represented as

$$|\psi_\pm\rangle = \cos\theta|\updownarrow\rangle \pm \sin\theta|\leftrightarrow\rangle, \tag{3.6}$$

where again, $0 \le \theta \le \pi/4$. Consider now the
interferometric setup depicted in Figure 6. A photon prepared in one
of these states enters polarising beamsplitter PBS1. This is oriented
so as to transmit photons which are horizontally polarised, and
reflect the vertically polarised ones. The vertical polarisation
component travels up to the ordinary beamsplitter BS, which has a
transmission coefficient

$$t = \frac{\sqrt{\cos 2\theta}}{\cos\theta}. \tag{3.7}$$

If a photon is transmitted here, then it will result in a count at
detector $D_?$, and give an inconclusive result. The probability of
this occurrence is the product of the probability that the photon is
vertically polarised (to enable it to travel along the upper branch
of the interferometer), which, from Eq. (3.6), is given by
$\cos^2\theta$, and the probability of transmission, given by $t^2$.
The result is simply $\cos 2\theta = |\langle\psi_+|\psi_-\rangle|$,
which is the minimum probability in Eq. (3.5).

An inconclusive result is not obtained if this component is reflected
at BS, in which case it will encounter a second polarising
beamsplitter PBS2, which, like PBS1, transmits horizontal and
reflects vertical polarisation. In the absence of a detection at
PBS1, the photon will emerge from PBS2, its state having undergone
the transformation

$$|\psi_\pm\rangle \rightarrow \frac{1}{\sqrt{2}}\left(|\updownarrow\rangle \pm |\leftrightarrow\rangle\right). \tag{3.8}$$

These states are orthogonal, and can be distinguished using a third
polarising beamsplitter PBS3. As in the Barnett-Riis experiment
discussed in the preceding section, a polarising beamsplitter, here
PBS3, oriented at $\pi/4$ to the horizontal can be used to deflect
the photon to one of the detectors $D_+$ and $D_-$ only when its
initial state was $|\psi_+\rangle$ or $|\psi_-\rangle$ respectively.
The wrong path is never taken, so that when successful, the
discrimination attempt will always give the correct answer.
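The amplitudes in the interferometer described above can be traced with a few lines of arithmetic. This is only a sketch under stated assumptions: the beamsplitter BS is taken to be lossless with real transmission and reflection amplitudes satisfying $t^2 + r^2 = 1$, any phase acquired on reflection is ignored, and $\theta = \pi/6$ is an arbitrary example value. The variable names are hypothetical.

```python
import math

theta = math.pi / 6                      # assumed example angle

# Input amplitudes (vertical, horizontal) of the two possible states.
inputs = {+1: (math.cos(theta),  math.sin(theta)),
          -1: (math.cos(theta), -math.sin(theta))}

t = math.sqrt(math.cos(2 * theta)) / math.cos(theta)   # transmission at BS
r = math.sqrt(1 - t ** 2)   # assumed lossless BS with real amplitudes

# Probability of a count at the inconclusive detector.
p_inconclusive = (math.cos(theta) * t) ** 2

# Unnormalised amplitudes after PBS2: the vertical component is scaled by r,
# the horizontal component passes unchanged.
outputs = {s: (v * r, h) for s, (v, h) in inputs.items()}

inner_product = sum(a * b for a, b in zip(outputs[+1], outputs[-1]))
p_conclusive = sum(a * a for a in outputs[+1])
```

With these assumptions the two output vectors have vanishing inner product, and the conclusive and inconclusive probabilities add up to one, as they must if no photons are lost elsewhere.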

While technically feasible, the experimental apparatus shown in
Figure 6 would have to be stabilised and aligned with extreme
accuracy. This difficulty was overcome in an ingenious variation of
this experiment reported in 1996 by Huttner et al [28]. An
interferometer, such as that in Figure 6, or other device where
photons with different polarisations travel along different paths,
might seem to be essential for this type of experiment, since the
non-orthogonal initial states are transformed into orthogonal ones
using polarisation-dependent losses (PDL). What Huttner and
collaborators realised was that the same effect can be achieved with
all photons travelling along the same path if the medium through
which they propagate itself has PDL. Using an optical fiber with this
property, they performed the experiment using highly attenuated
optical pulses (~0.1 photons per pulse). For $\theta = \pi/6$, they
obtained an error rate of 1.7%. Comparing this with the minimum error
probability in the Helstrom measurement, which is approximately 6.7%,
this measurement shows a clear improvement.
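The 6.7% figure quoted above can be reproduced from the standard expression for the minimum error probability of two equally likely pure states, $\frac{1}{2}\left(1 - \sqrt{1 - |\langle\psi_+|\psi_-\rangle|^2}\right)$. The sketch below assumes that this is the Helstrom limit referred to in the text, and the function names are hypothetical.

```python
import math

def helstrom_error(overlap):
    """Minimum error probability for two equally likely pure states."""
    return 0.5 * (1 - math.sqrt(1 - overlap ** 2))

def idp_inconclusive(overlap):
    """Minimum inconclusive probability for error-free discrimination."""
    return overlap

theta = math.pi / 6
overlap = math.cos(2 * theta)            # overlap of the two states

print(f"Helstrom error probability:   {helstrom_error(overlap):.3f}")
print(f"IDP inconclusive probability: {idp_inconclusive(overlap):.3f}")
```

For this angle the script prints 0.067 and 0.500: the error-free strategy removes the 6.7% error rate at the price of giving no answer in half of the trials.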

One disadvantage of using a PDL fiber is the fact that photons lost
do not go to a detector; they simply do not register anywhere. Thus,
the occasions when photons fail to result in a click at either $D_+$
or $D_-$, which we would like to interpret as being caused by
inconclusive results, cannot be distinguished from those null events
due to the weakness of the pulse. However, this does not detract from
the fact that the error probability they obtained for detected
photons was significantly less than the Helstrom bound.

Nevertheless, it is important to distinguish both kinds of null
result. At the time of writing, an experimental realisation of the
IDP measurement which follows the scheme shown in Figure 6 more
closely, and produces the optimal theoretical proportions of
conclusive and inconclusive results, has just been carried out by
Clarke et al.

B. Unambiguous discrimination between linearly 
independent states 

A naturally intriguing question is: how can this type of measurement
be generalised to more than two states? We would then consider a set
of $N$ quantum states $|\psi_j\rangle$, with $j = 1, \ldots, N$.
Correspondingly, we would have detection operators $\Pi_j$ satisfying

$$\langle\psi_{j'}|\Pi_j|\psi_{j'}\rangle = P_j\,\delta_{jj'}. \tag{3.9}$$

This condition says that outcome $j$ can only occur when the initial
state is $|\psi_j\rangle$. The conditional probability, given that
the system was prepared in this state, that it will be successfully
identified, is $P_j$. There will also be a further detection operator
$\Pi_?$ corresponding to inconclusive results.

In [29], I showed that such a measurement strategy can only exist if
the states $|\psi_j\rangle$ are linearly independent. For $N$
linearly independent states, the form of the detection operator
$\Pi_j$ is an immediate generalisation of that for the two state
case, given by Eq. (3.2):

$$\Pi_j = \frac{P_j}{|\langle\psi_j^\perp|\psi_j\rangle|^2}\,|\psi_j^\perp\rangle\langle\psi_j^\perp|, \qquad \Pi_? = 1 - \sum_{j=1}^{N}\Pi_j. \tag{3.10}$$

The normalised state $|\psi_j^\perp\rangle$ is defined as that which
is orthogonal to all $|\psi_{j'}\rangle$ for $j \neq j'$. Up to an
overall phase, this state is unique and known as the reciprocal
state. The relationship of the reciprocal states to the original
states is exactly analogous to that in crystallography between the
reciprocal vectors and the Bravais lattice vectors, where each member
of the former set of vectors is orthogonal to all but one member of
the latter set [30].

As we saw in the case of unambiguous discrimination between just two
states, it is important to examine how the possible initial states
are transformed when an inconclusive result is obtained. An optimal
unambiguous discrimination measurement on a pair of states transforms
them into the same state if the measurement fails. It is shown in
[29] that for a set of $N$ linearly independent states, a failure
will transform them into a linearly dependent set, making any further
attempt to discriminate between them without errors impossible.

Nevertheless, an inconclusive result does not necessar- 
ily erase all of the information about the state (except in 
the case N = 2), and it is still possible to obtain some 
information. If the unambiguous discrimination attempt 
fails, one can still carry out the quantum hypothesis test- 
ing strategy, described in the preceding section, on the 
resulting linearly dependent states. 

An important question is whether or not one can obtain an analytic
expression for the minimum probability of inconclusive results for
more than two states. By analogy with the two state case, we take the
state $|\psi_j\rangle$ to have a priori probability $\eta_j$, and see
that the total probability of inconclusive results is given by

$$P_? = 1 - \sum_{j=1}^{N} \eta_j P_j. \tag{3.11}$$

As in the two state case, the variational problem consists of
determining the $P_j$ which minimise $P_?$ subject to the constraint
that the inconclusive result operator $\Pi_?$ is positive.

As with the quantum hypothesis testing strategy, it is difficult to
find explicit solutions to this optimisation problem for unambiguous
discrimination with arbitrary states. However, also as with the
hypothesis testing strategy described in the preceding section, the
problem is explicitly soluble for equally probable symmetrical
states, which satisfy Eqs. (2.23-2.24) [31]. In fact, at the time of
writing, this is the only known solution for more than two states.

This solution is expressed in terms of a special representation of
these states. Here, of course, we are concerned with symmetrical
states which are also linearly independent. Note that these are
completely distinct conditions. The trine ensemble of three symmetric
states of a two-level system, discussed in the preceding section, is
clearly linearly dependent. If the $|\psi_j\rangle$ are both linearly
independent and symmetric, then they may be written as

$$|\psi_j\rangle = \sum_{k=0}^{N-1} c_k\,e^{2\pi i jk/N}|k\rangle, \tag{3.12}$$



for some coefficients $c_k$ and orthonormal states $|k\rangle$. In
fact, the $|k\rangle$ are the eigenstates of the unitary operator $U$
in Eqs. (2.23) and (2.24). The coefficients $c_k$ satisfy the
normalisation condition $\sum_k |c_k|^2 = 1$. If the
$|\psi_j\rangle$ have equal a priori probabilities $\eta_j = 1/N$,
then the minimum value of $P_?$ is given by

$$P_?(\mathrm{opt}) = 1 - N \times \min_k |c_k|^2. \tag{3.13}$$
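A small numerical sketch may help to make this representation concrete. It assumes that the symmetric states take the form $\sum_k c_k e^{2\pi i jk/N}|k\rangle$ with $U$ diagonal in the $|k\rangle$ basis, and it takes the optimal success probability to be $N \min_k |c_k|^2$, so that the inconclusive probability is one minus this. The coefficient values and function names are hypothetical. For two states with $c = (\cos\theta, \sin\theta)$ the result can be compared directly with the IDP limit of Eq. (3.5).

```python
import cmath
import math

def symmetric_states(c):
    """Return the N states sum_k c_k exp(2 pi i j k / N) |k>, j = 1..N."""
    n = len(c)
    return [[c[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)]
            for j in range(1, n + 1)]

def apply_u(state):
    """U is diagonal in the |k> basis with eigenvalues exp(2 pi i k / N)."""
    n = len(state)
    return [cmath.exp(2j * math.pi * k / n) * a for k, a in enumerate(state)]

def optimal_inconclusive(c):
    """P_?(opt) = 1 - N * min_k |c_k|^2 for equal a priori probabilities."""
    return 1 - len(c) * min(abs(x) ** 2 for x in c)

# Three states with assumed example coefficients (normalised to 1).
c3 = [0.8, 0.5, math.sqrt(1 - 0.8 ** 2 - 0.5 ** 2)]
states = symmetric_states(c3)
shifted = apply_u(states[0])             # should coincide with states[1]
wrapped = apply_u(states[2])             # should coincide with states[0]

# Two states: c = (cos theta, sin theta) gives back the overlap cos(2 theta).
theta = math.pi / 6
p2 = optimal_inconclusive([math.cos(theta), math.sin(theta)])
```

In the two-state case `p2` equals $\cos 2\theta$, the overlap of the two states, which is the consistency check with the IDP limit mentioned above.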

For optimum unambiguous discrimination between lin- 
early independent symmetric states, it turns out that the 
conditional probabilities Pj are equal for all states, and 
therefore, from the equality of the a priori probabilities 
that we have been assuming, equal to 1 — P?(opt). This 
measurement has been found to have some novel applications. For
example, it has been shown by Dusek et al [32] that the possibility
of such a measurement has
worrying implications for quantum cryptography. These 
authors showed that for realistic implementations of the 
first quantum key distribution protocol, devised by Ben- 
nett and Brassard in 1984 (BB84), the use of this mea- 
surement as an eavesdropping strategy can render the 
protocol insecure for certain detector efficiencies. 

The optimisation problem for more than two states
has also been examined by Peres and Terno [33]. These
authors gave a particularly detailed examination of the 
geometry and topology of the set of detection operators 
for 3 states, and showed how their method can be gener- 
alised to an arbitrary number of states. 

In the following section, we shall look at a further in- 
teresting application of unambiguous discrimination be- 
tween symmetric states, which relates to the manipula- 
tion of quantum entanglement. 

IV. STATE DISCRIMINATION AND 
ENTANGLEMENT 

A. Entanglement and quantum correlations 

Recently, much attention has been paid to a peculiar 
type of correlation between quantum systems known as 
entanglement. In this section, we shall examine some of 
the main properties of entanglement, placing particular 
emphasis on those which are related to state discrimination. We shall
begin by describing the type of nonlocal correlations which can occur
between systems which are entangled and show how, if it were possible
to discriminate between arbitrary quantum states, then this could be
used to transmit information across large distances instantaneously,
in violation of the special theory of relativity.

Entanglement is produced when quantum systems in- 
teract with one another. If an operation on a pair of 
quantum systems involves no interaction between them, 
then it may be implemented as a series of distinct oper- 
ations on the individual component systems, otherwise 
known as local quantum operations, perhaps together 
with classical communication between the agencies in 
possession of the components. 

It is widely acknowledged that the fundamental properties of
entanglement are that it is invariant under local unitary quantum
operations and cannot, on average, increase under arbitrary local
quantum operations and classical communication [34,35]. However, if
we have a state which is slightly entangled, it is possible to act
only on one of the subsystems in a way which sometimes produces more
entanglement. Although using only local quantum operations, and
possibly classical communication, we cannot increase entanglement on
average, Bennett et al [36] discovered that we can gamble a small
amount of initial entanglement with the possibility of obtaining
more. This idea of obtaining, with some probability, a gain which
cannot be acquired deterministically also lies at the heart of
unambiguous discrimination. We shall see that the relationship
between these two operations is far from superficial [29,37]. In
fact, the local operation on one of the entangled subsystems which
transforms the entire state into a maximally entangled state with
maximum probability is also the operation which performs optimal
unambiguous discrimination between a related set of symmetrical
states [29], which we discussed in the preceding section.

It is helpful to begin by explaining what an entangled state is.
Here, we shall consider only pure states. Suppose that Alice and Bob
possess two quantum systems, A and B. If these systems have been
prepared independently in the states $|\psi^1\rangle$ and
$|\psi^2\rangle$ respectively, then the state of the combined system
will be of the form

$$|\psi\rangle = |\psi^1\rangle_A|\psi^2\rangle_B. \tag{4.1}$$

Such a state is known as a product state. The significance of this
form becomes apparent when we calculate expectation values of
physical observables. Let $\alpha$ be an observable for A, and
$\beta$ be one for B. Then the expectation value of the product
$\alpha\beta$ is simply

$$\langle\psi|\alpha\beta|\psi\rangle = \langle\psi^1|\alpha|\psi^1\rangle\langle\psi^2|\beta|\psi^2\rangle, \tag{4.2}$$

that is, it is simply the product of the expectation values of the
two observables. If these two operators are projection operators,
then they represent propositions, and their expectation values are
the probabilities that these propositions are true. The product
$\alpha\beta$, which is a projection operator on the space of the
combined system, represents the logical 'and' of these two
propositions. We see then from Eq. (4.2) that the probability of
'$\alpha$ and $\beta$' being true is simply the product of the
probabilities of $\alpha$ being true and $\beta$ being true. This
implies that the truth probabilities of these two propositions are
uncorrelated. Since these propositions are completely arbitrary, no
property of particle A has any correlation with any of particle B.

The product state in Eq. (4.1) is not, however, the 
most general type of pure state of A and B. The su- 
perposition principle implies that the state of the entire 
system can be any linear combination of product states 
such as those in Eq. (4.1). An entangled state is such 
a superposition which cannot be expressed as a single 
product state. The product rule in Eq. (4.2) for ex- 
pectation values of local observables does not generally 
hold for such states. One of the most extensively studied 
entangled states is the singlet state

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|u\rangle_A|u'\rangle_B - |u'\rangle_A|u\rangle_B\right). \tag{4.3}$$

The states $|u\rangle$ and $|u'\rangle$ are orthogonal for each
particle, and the singlet state is said to be maximally entangled. We
shall shortly examine the problem of quantifying the amount of
entanglement in general states. For the moment though, we shall
consider only the singlet state as this can be used to show that, if
it were possible to discriminate between arbitrary states, then
information could be transmitted instantaneously.

We begin by noting that, since $|u\rangle$ and $|u'\rangle$ are
orthogonal for each system, we can construct the Hermitian operators
$U_A$ and $U_B$ which have these states as their eigenstates. A
measurement of one of these operators can be used to distinguish
perfectly between these states.

The singlet state has an interesting symmetry property, in that the
states $|u\rangle$ and $|u'\rangle$ can be any pair of orthogonal
states at all, and $|\psi\rangle$ still has the form shown in Eq.
(4.3). We can then define another pair of orthogonal states,
$|v\rangle$ and $|v'\rangle$ for each system, which are the
eigenstates of the Hermitian operators $V_A$ and $V_B$, and rewrite
$|\psi\rangle$ as

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|v\rangle_A|v'\rangle_B - |v'\rangle_A|v\rangle_B\right). \tag{4.4}$$

Measurements on entangled states such as $|\psi\rangle$ enforce
non-local correlations between the subsystems. For example, if Alice
measures $U$, then the state of her particle will collapse into
either $|u\rangle$ or $|u'\rangle$. If Bob were then to carry out a
measurement of $U$ on his system, he would obtain, with unit
probability, the opposite result, so that if Alice obtained
$|u\rangle$, Bob would obtain $|u'\rangle$, and vice versa. The same
holds true for the states $|v\rangle$ or $|v'\rangle$, or any other
pair of orthogonal states. Alice is then able to predict the result
of Bob's measurement, immediately, if he measures the same observable
as her. If no signal can travel from Alice to Bob faster than the
speed of light, then we might be led to conclude that the information
Alice obtains about Bob's subsequent measurement already exists in
Bob's particle. However, since the observable is arbitrary, it would
then follow that the information describing the results of all
possible measurements on Bob's particle must already exist. Such an
interpretation would not involve superluminal communication. This
hypothesis, known as local-realism, is in sharp contrast to the idea
of complementarity. Einstein, who discovered these correlations with
Podolsky and Rosen [38] in 1935, expressed a preference for local
realism over the alternative, which he imagined must be related to
some kind of 'spooky action at a distance'.
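The basis independence of the singlet state and the perfect anticorrelations described above can be checked with a short sketch. For simplicity it assumes real amplitudes and takes the second basis to be a real rotation of the first by an arbitrary example angle; for a general unitary change of basis the singlet state would be reproduced only up to an overall phase. The function and variable names are hypothetical.

```python
import math

def kron(a, b):
    """Amplitudes of the product state |a>_A |b>_B."""
    return [x * y for x in a for y in b]

def singlet(first, second):
    """(|first>|second> - |second>|first>) / sqrt(2)."""
    return [(x - y) / math.sqrt(2)
            for x, y in zip(kron(first, second), kron(second, first))]

def joint_probability(state, a, b):
    """Probability that Alice finds |a> and Bob finds |b> (real amplitudes)."""
    return sum(x * y for x, y in zip(kron(a, b), state)) ** 2

u, u_prime = [1.0, 0.0], [0.0, 1.0]
angle = 0.7                                   # assumed rotation angle
v = [math.cos(angle), math.sin(angle)]
v_prime = [-math.sin(angle), math.cos(angle)]

psi_u = singlet(u, u_prime)                   # written in the u basis
psi_v = singlet(v, v_prime)                   # written in the v basis

same_result = joint_probability(psi_u, v, v)        # Alice v, Bob v
opposite = joint_probability(psi_u, v, v_prime)     # Alice v, Bob v'
```

The two amplitude lists `psi_u` and `psi_v` agree, `same_result` vanishes and `opposite` equals 1/2, so whenever Alice finds one state of a pair, Bob is certain to find the other.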

In local-realistic theories, the apparent randomness of the results
of quantum measurements is considered to be an illusion, perhaps due
to our ignorance of some other significant parameters or hidden
variables. It is then important to determine whether or not the
predictions of quantum mechanics can be reproduced by a
local-realistic hidden variable theory. However, in 1964, Bell
published a theorem, according to which the correlations produced by
any such theory must satisfy a certain inequality [39]. For suitable
parameter choices, the predictions of quantum mechanics violate
Bell's inequality for all pure entangled states, but not for product
states [40], and curiously, not for some mixed entangled states [41].
The predictions of quantum mechanics have generally been vindicated
by experiment, most famously in the experiments conducted by Aspect
et al [42].

If these correlations are genuinely non-local, and do not result from
the relativistically causal transmission of information between the
two systems, can Alice and Bob use them to transmit information to
each other? If Bob could discriminate, with zero probability of
error, between the four states $|u\rangle$, $|u'\rangle$,
$|v\rangle$ or $|v'\rangle$, then he could tell whether Alice
measured $U_A$ or $V_A$. If she wishes to transmit '1' to Bob, she
measures $U_A$. If on the other hand, she wishes to communicate a '0'
to him, she would measure $V_A$ instead.

The nonlocal nature of entanglement cannot be used to transmit
information in this manner, which would avoid actually sending
physical systems, and thus the universal speed limit $c$. General
proofs of the impossibility of superluminal communication using
entanglement and measurement have been obtained [43]. On the basis of
the above argument, these proofs must implicitly place restrictions
on the extent to which the state of a quantum system can be
determined.

Although entanglement cannot be used for superluminal communication,
it does have several other applications. For example, it can be used,
in conjunction with a classical communication channel, to teleport an
unknown quantum state from one location to another [44]. It can also
be used to transmit classical information at twice the maximum rate
that can be achieved using classical physics, using a technique known
as superdense quantum coding [45]. Also, many of the recently
discovered advantages of using quantum systems for computing (the
best known of which is Shor's algorithm for factorising a number in
polynomial time |Q , for which the best known classical algorithms
require exponential time) make explicit use of entanglement to carry
out computations more efficiently than any computer operating solely
by the laws of classical physics can manage.

It is therefore important to understand the conditions
under which entanglement can be manipulated. In particular,
how do we quantify entanglement? If Alice and
Bob share some entangled state $|\psi\rangle$, how much entanglement
does it contain?

To answer this, we have to understand the most general
form of an entangled state. We arrived at the concept of
entanglement via the superposition principle, which enabled
us to construct a linear combination of product
states which is not itself a product state. The most general
pure state of a two-particle system is simply the most
general superposition of product states. For a pair of
$N$-level quantum systems, this is

$$|\psi\rangle = \sum_{j,k=1}^{N} b_{jk}\, |\alpha_j\rangle_A \otimes |\beta_k\rangle_B, \tag{4.5}$$

where, without loss of generality, the subsystem states
$|\alpha_j\rangle$ and $|\beta_k\rangle$ are taken to be orthonormal, so that

$$\sum_{j,k=1}^{N} |b_{jk}|^2 = 1.$$

The $b_{jk}$ are almost completely free parameters. They
are subject only to this normalisation constraint. Such
a large number of free parameters makes the expression
in Eq. (4.5) somewhat unwieldy. Fortunately, a simpler
representation of $|\psi\rangle$ can be obtained using an important
result known as the Schmidt decomposition theorem [16].
This states that there exists an orthonormal basis $|\alpha'_j\rangle$ for
particle A and $|\beta'_j\rangle$ for particle B such that the state
takes the form

$$|\psi\rangle = \sum_{j=1}^{N} c_j\, |\alpha'_j\rangle_A \otimes |\beta'_j\rangle_B, \tag{4.6}$$

i.e. using these special bases, known as the Schmidt
bases, we can write the state as a single, rather than
a double sum over product states.

For a product state, only one of the $c_j$ is non-zero. If
more than one is non-zero, the state is entangled. To
quantify this entanglement, we have to form the density
operator for one of the subsystems. These are known
as reduced density operators, and are denoted by $\rho_A$ and
$\rho_B$. The reduced density operator of either subsystem
is formed by taking the trace of the density operator of the
entire system with respect to the other subsystem, i.e.

$$\rho_A = \mathrm{Tr}_B(|\psi\rangle\langle\psi|), \qquad \rho_B = \mathrm{Tr}_A(|\psi\rangle\langle\psi|). \tag{4.7}$$

We find that these are

$$\rho_A = \sum_{j=1}^{N} |c_j|^2\, |\alpha'_j\rangle\langle\alpha'_j|, \qquad \rho_B = \sum_{j=1}^{N} |c_j|^2\, |\beta'_j\rangle\langle\beta'_j|. \tag{4.8}$$

We are now in a position to quantify the amount of entanglement
in the state $|\psi\rangle$. The entanglement, or entropy
of entanglement, $E(\psi)$, is given by the von Neumann entropy [47]
of either of the reduced density operators:

$$E(\psi) = -\mathrm{Tr}_A\, \rho_A \log \rho_A = -\mathrm{Tr}_B\, \rho_B \log \rho_B. \tag{4.9}$$

The logarithm is conventionally taken to have base 2.
Entanglement is measured in ebits. If only one $c_j$ is non-zero,
corresponding to a product state, then $E(\psi) = 0$.
At the other extreme, if all $|c_j|^2$ are equal, then normalisation
implies that $|c_j|^2 = 1/N$, and $E(\psi) = \log N$. Such
states are called maximally entangled states since they
possess the most entanglement for a given $N$. The singlet
state in Eq. (4.3) is a maximally entangled state,
and has 1 ebit of entanglement.

One of the major advantages of using the entropy of
entanglement to quantify this property is the fact that it
is additive. To understand the meaning of this, suppose
that Alice and Bob share 2 entangled states, $|\psi\rangle$ and $|\psi'\rangle$.
How much entanglement do they possess? Considering
these two entangled systems individually, we would conclude
that the total entanglement shared by Alice and
Bob is just the sum of the entanglements of $|\psi\rangle$ and $|\psi'\rangle$.
If, however, we consider them to be composite parts of a
larger entangled system in the state $|\psi_L\rangle = |\psi\rangle \otimes |\psi'\rangle$, then
the entanglement shared by Alice and Bob is that in $|\psi_L\rangle$.
Clearly, to quantify shared entanglement unambiguously,
it is necessary that an entanglement measure $E$ satisfies
$E(\psi_L) = E(\psi) + E(\psi')$. Fortunately, the entropy of
entanglement has this desirable additive property for pure
entangled states. However, no entanglement measure has
yet been shown to be additive over the set of all mixed
entangled states. Recently, though, additivity has been
shown to hold for one of the most important entanglement
measures, the relative entropy of entanglement, for
a large class of states [48].
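
As an illustration of the entropy of entanglement and its additivity for pure states, the following minimal sketch evaluates $E(\psi) = -\sum_j |c_j|^2 \log_2 |c_j|^2$ directly from a list of Schmidt coefficients. The function names and the example coefficients are hypothetical choices made here. The sketch uses the fact that the Schmidt coefficients of a product of two entangled states are the pairwise products of the individual coefficients.

```python
import math

def entanglement_entropy(schmidt_coeffs):
    """Entropy of entanglement in ebits from Schmidt coefficients c_j."""
    probs = [abs(c) ** 2 for c in schmidt_coeffs]
    norm = sum(probs)
    probs = [p / norm for p in probs]  # guard against rounding errors
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def product_schmidt(c, c_prime):
    """Schmidt coefficients of the joint state |psi> (x) |psi'>."""
    return [a * b for a in c for b in c_prime]

singlet = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # maximally entangled, N = 2
partial = [math.sqrt(0.9), math.sqrt(0.1)]       # partly entangled example

e_singlet = entanglement_entropy(singlet)
e_product = entanglement_entropy([1.0, 0.0])     # product state
e_joint = entanglement_entropy(product_schmidt(singlet, partial))
e_sum = e_singlet + entanglement_entropy(partial)
```

Here `e_singlet` is 1 ebit, `e_product` vanishes, and `e_joint` agrees with `e_sum` up to rounding, as additivity requires.
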

B. Entanglement concentration and unambiguous
state discrimination

The entanglement $E(\psi)$ cannot be deterministically increased
by acting on the subsystems individually, even
if classical communication is allowed between Alice and
Bob. However, it is possible to use an unambiguous
discrimination-type measurement on either A or B which
will, with some probability, convert $|\psi\rangle$ into a
maximally-entangled state. Such an operation is known as
entanglement concentration. For the sake of definiteness, we let
the measurement be performed by Alice on particle A.
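We first make use of a new orthonormal basis set $|y_k\rangle$ for
Bob's particle. These states are defined through

$$|\beta'_j\rangle = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} \exp\left(\frac{2\pi i jk}{N}\right) |y_k\rangle. \tag{4.10}$$

This expression allows us to rewrite the partly entangled
state $|\psi\rangle$ in Eq. (4.6) as

$$|\psi\rangle = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} |x_k\rangle_A \otimes |y_k\rangle_B, \tag{4.11}$$

where we have introduced another set of new states $|x_k\rangle$,
defined by

$$|x_k\rangle = \sum_{j=1}^{N} c_j \exp\left(\frac{2\pi i jk}{N}\right) |\alpha'_j\rangle. \tag{4.12}$$

Both the orthonormality of the $|y_k\rangle$ and the representation
in Eq. (4.11) follow from the relation

$$\sum_{k=1}^{N} \exp\left[\frac{2\pi i k(j - j')}{N}\right] = N \delta_{jj'}. \tag{4.13}$$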



The $|x_k\rangle$ are normalised, although they are not orthogonal.
If they were, then the expression in Eq. (4.11) for
$|\psi\rangle$ would represent a maximally-entangled state, which
it cannot be since our transformation of basis is passive,
doing nothing to change the entanglement of the state.
These states are, however, linearly independent. In fact,
comparing Eq. (4.12) with Eq. (3.12), we see that they
constitute a set of linearly independent symmetric states
of the kind we discussed in the preceding section.

Looking at Eq. (4.11), we can say that what prevents
$|\psi\rangle$ from being maximally entangled is the
non-orthogonality of the $|x_k\rangle$. However, unambiguous
discrimination can be regarded as an operation which transforms
non-orthogonal states into orthogonal ones. This
was made explicit in the experimental realisation of
unambiguous discrimination for a pair of states [?],
discussed in the preceding section. To see how this can
be exploited to transform $|\psi\rangle$, with some probability,
into a maximally-entangled state, consider the detection
operators for unambiguous discrimination between
the $|x_k\rangle$. It follows from Eq. (3.10) that these are
$\Pi_k = P_k |x_k^\perp\rangle\langle x_k^\perp| / |\langle x_k^\perp|x_k\rangle|^2$. The states $|x_k^\perp\rangle$ are the
reciprocal states corresponding to the $|x_k\rangle$. These operators,
together with $\Pi_? = 1 - \sum_{k=1}^{N} \Pi_k$, represent a
measurement whose possible outcomes are the $N$ states
$|x_k\rangle$, and the inconclusive result. To use this type of
measurement for entanglement concentration, it is more
appropriate to consider just two outcomes, described by
the inconclusive result operator, $\Pi_?$, and the following
operator

$$\Pi_O = \sum_{k=1}^{N} \Pi_k. \tag{4.14}$$



By definition, this pair of operators forms the required
resolution of the identity in Eq. (2.13). The reason for the
subscript O in Eq. (4.14) will become apparent shortly. This
pair of detection operators describes a measurement having
two outcomes: success or failure of the state discrimination
measurement. It does not tell us which state has
been detected when it succeeds. This might not seem
very useful at first sight. However, the state transformation
generated by this measurement is precisely that
which transforms $|\psi\rangle$ into a maximally entangled state.
As we discussed in section II, to determine how the state
of a system is transformed by a generalised measurement,
we need to find an operator $A$ such that $\Pi = A^\dagger A$, when
the result corresponds to the detection operator $\Pi$. For
the operator $\Pi_O$, the corresponding transformation
operator is the orthogonalisation operator $A_O$,

$$A_O = \sum_{k=1}^{N} \sqrt{P_k}\, \frac{|\phi_k\rangle\langle x_k^\perp|}{\langle x_k^\perp|x_k\rangle}, \tag{4.15}$$

where the states $|\phi_k\rangle$ may be any orthonormal basis. The
arbitrariness of this basis is equivalent to that of the unitary
transformation in Eq. (2.14). As we saw in the
preceding section, the maximum probability of discriminating
between a set of $N$ symmetric states is obtained
when all of the $P_k$ are equal to $1 - P_?(\mathrm{opt})$, where $P_?(\mathrm{opt})$
is given by Eq. (3.13).

Using the prescription for state transformations in Eq.
(2.15), we see that if the system is initially prepared in
one of the non-orthogonal states $|x_k\rangle$, then when this
measurement succeeds, the state will be transformed into the
corresponding member of the orthogonal basis $|\phi_k\rangle$. In
fact, the unambiguous discrimination measurement can
be regarded as this orthogonalisation procedure, followed
by a von Neumann measurement in the orthogonal basis
$|\phi_k\rangle$. This is precisely what happens in the photonic
implementation of two-state discrimination we discussed
in section III. There, the non-orthogonal photon states
were, with probability $1 - P_?(\mathrm{opt})$, transformed into the
orthogonal states in Eq. (3.8) before being discriminated.

Looking at the representation we have for the entangled
state $|\psi\rangle$ in Eq. (4.11), we see that if Alice carries
out this operation on particle A, and if it succeeds, then
the final state will be

$$\frac{1}{\sqrt{N}} \sum_{k=1}^{N} |\phi_k\rangle_A \otimes |y_k\rangle_B, \tag{4.16}$$

which is our promised maximally-entangled state. The
probability of success for this operation is given by
$1 - P_?(\mathrm{opt}) = N \times \min_j |c_j|^2$. As it happens, this is the maximum
probability of converting $|\psi\rangle$ into a maximally-entangled
state. The maximum probability of converting a
non-maximally entangled pure state of two systems into a
maximally-entangled one, using only local quantum operations
and possibly classical communication, was found
by Lo and Popescu [49]. The strategy we have been
discussing reaches their bound.
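
The construction can be checked numerically in the simplest case. The sketch below is illustrative only: it assumes the symmetric states have the form $|x_k\rangle = \sum_j c_j \exp(2\pi i jk/N)|\alpha'_j\rangle$ and that the success probability is $N \min_j |c_j|^2$. The function names and the example value of the Schmidt angle are hypothetical. For $N = 2$ the success probability should coincide with the Ivanovic-Dieks-Peres limit $1 - |\langle x_1|x_2\rangle|$ for discriminating the two states $|x_k\rangle$.

```python
import cmath
import math

def symmetric_states(c):
    """States |x_k> as coefficient lists in the Schmidt basis of particle A."""
    n = len(c)
    return [[c[j - 1] * cmath.exp(2j * math.pi * j * k / n)
             for j in range(1, n + 1)]
            for k in range(1, n + 1)]

def inner(u, v):
    """Inner product <u|v>."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def concentration_probability(c):
    """Success probability N * min_j |c_j|^2 for Schmidt coefficients c."""
    return len(c) * min(abs(cj) ** 2 for cj in c)

theta = math.pi / 8
c = [math.cos(theta), math.sin(theta)]   # Schmidt coefficients for N = 2
x = symmetric_states(c)
norms = [abs(inner(xk, xk)) for xk in x]  # each |x_k> should be normalised
overlap = abs(inner(x[0], x[1]))          # non-zero: the states are not orthogonal
p_success = concentration_probability(c)
```

With these values the two states are normalised but not orthogonal, and `p_success` equals `1 - overlap` up to rounding.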

Several additional results on transforming one entangled
state into another using only local operations and
classical communication have been found. An important
question is: under what circumstances can one pure
entangled state be transformed into another with unit
probability? The solution to this problem was obtained
by Nielsen [50], whose results introduced the powerful
mathematical technique of majorisation to the study of
entanglement.

Deterministic transformations of this kind are not possible
for all pairs of states. Vidal [51] obtained a general
expression for the maximum probability that any pure
entangled state can be converted into any other.

V. UNAMBIGUOUS DISCRIMINATION AND 
EXACT CLONING 

A. Relationship between quantum cloning and state 
discrimination 

Another operation closely related to unambiguous
discrimination is exact cloning. In 1982, it was discovered
independently by Wootters and Zurek [52] and Dieks [53]
that the state of a quantum system, if unknown, cannot
be copied. As with discrimination, no completely reliable
procedure exists for this unless the state belongs to
a known orthogonal set.

By analogy with the possibility of unambiguous
discrimination between linearly independent states, it is
possible, as was initially demonstrated by Duan and Guo
[56], to build a machine which, with some probability,
produces exact copies of such states. We will examine
the relationship between these two operations, in particular,
that between their maximum success probabilities.
In fact, for just two states, both operations can be
regarded as particular cases of a more general procedure
known as quantum state separation [57], which we will
also describe.

To understand the relationship between state discrimination
and cloning, suppose that Alice gives Bob one
of the $N$ quantum states $|\psi_j\rangle$. He isn't told which,
although again, he knows what the possible states $|\psi_j\rangle$ are,
and also their a priori probabilities $\eta_j$. If he can
discriminate between them, then upon identifying the state, he
can manufacture as many further copies of it as he desires.
Therefore, if he can discriminate between them,
then he can also clone them.

If, on the other hand, Bob could clone a set of states,
then he could also discriminate between them. This follows
from the fact that if Bob could make one copy, he
could make arbitrarily many. He could then make use of
the fact that, given a sufficiently large number of copies
of the state, he could determine the expectation value of
any observable, to an arbitrarily high degree of accuracy,
by repeatedly measuring it on the members of his ensemble
of clones. If the ensemble is large enough, he could
evaluate the expectation values of several observables, in
fact, any finite number of them, to any degree of accuracy.
If he chooses the correct observables, it would be
possible for him to infer the state itself from the expectation
values. For an $N$-dimensional system, the density
operator is specified by $N^2 - 1$ independent real parameters.
It is quite easy to see why. The density operator
has $N^2$ elements, each of which is complex. It is therefore
determined by $2N^2$ real parameters. There are $N^2$
constraints due to Hermiticity, and a further constraint
comes from the requirement of normalisation. Therefore,
to determine the state, Bob must know the expectation
values of at least $N^2 - 1$ observables. Fortunately, with a
judicious choice of observables, this lower bound can be
attained.

The simplest example is the case of a two-level system
or qubit. The state $\rho$ of a qubit can be conveniently
expressed in the Bloch representation. We are already
familiar with the representation of a qubit as a spin-1/2
particle, and the eigenstates $|\pm\rangle$ of the z component of
the spin, this being represented by the operator $\sigma_z$. It
remains to introduce the other Cartesian components $\sigma_x$
and $\sigma_y$ of this vector operator, the Pauli spin operator
$\boldsymbol{\sigma}$. These components act on the eigenstates of $\sigma_z$ in the
following way:

$$\sigma_x|\pm\rangle = |\mp\rangle, \qquad \sigma_y|\pm\rangle = \pm i|\mp\rangle. \tag{5.1}$$

All three Cartesian components of $\boldsymbol{\sigma}$ are both Hermitian
and unitary operators (implying that the square of each
of them is 1). They also have eigenvalues $\pm 1$, so that their
trace is zero. The Bloch representation of the density
operator $\rho$ is obtained by writing it as a combination of
these operators and the identity,

$$\rho = \frac{1}{2}(1 + \mathbf{a}\cdot\boldsymbol{\sigma}). \tag{5.2}$$

The components of the Bloch vector $\mathbf{a}$ are real and the
length of this vector $|\mathbf{a}|$ is no greater than 1. If it is equal
to 1, then $\rho$ is a pure state. If it is equal to 0, then
$\rho = 1/2$, meaning that it is a completely mixed state.
Thus, we may take the length of the Bloch vector to be
an indicator of how pure the state is.

Determination of the state $\rho$ of a qubit clearly amounts
to finding the components of the Bloch vector, whose
3 components are a special case of the general number
of parameters $N^2 - 1$. To evaluate these components,
Bob need only measure the expectation values of each
$\sigma_k$. These are equal to $\mathrm{Tr}\,\sigma_k\rho = a_k$, where $k = x, y, z$.
This can be seen from the fact that $\sigma_k$ has zero trace,
and from the identity $\mathrm{Tr}\,\sigma_k\sigma_l = 2\delta_{kl}$.
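
The relation $\mathrm{Tr}\,\sigma_k\rho = a_k$ can be verified with a few lines of code. This is a minimal sketch using plain nested lists for the 2x2 matrices in the basis $(|+\rangle, |-\rangle)$; the helper names and the example Bloch vector are hypothetical, and exact expectation values are assumed rather than finite measurement statistics.

```python
# Pauli matrices as nested lists, in the basis (|+>, |->).
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

def density_from_bloch(a):
    """rho = (1 + a.sigma)/2 for a Bloch vector a = (ax, ay, az)."""
    ax, ay, az = a
    return [[(1 + az) / 2, (ax - 1j * ay) / 2],
            [(ax + 1j * ay) / 2, (1 - az) / 2]]

def expectation(op, rho):
    """Tr(op rho) for 2x2 matrices."""
    return sum(op[i][j] * rho[j][i] for i in range(2) for j in range(2))

def bloch_from_density(rho):
    """Recover a_k = Tr(sigma_k rho); the values are real for Hermitian rho."""
    return tuple(expectation(SIGMA[k], rho).real for k in "xyz")

a = (0.3, -0.4, 0.5)                      # example Bloch vector, |a| < 1
recovered = bloch_from_density(density_from_bloch(a))
```

The three expectation values return the Bloch vector that was used to build the state, which is the sense in which three observables suffice for a qubit.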

While it is useful to know that, in principle, the expectation
values of a set of observables are sufficient to
determine the state of a system, in practice, these quantities
cannot be measured exactly, as this would require
an infinite number of measurements to be carried out.
Derka et al [54] showed how the state can nevertheless
be estimated from the available data using Bayesian
inference techniques.

Using standard von Neumann measurements, the state
can be extracted from a large number of copies by measuring
$N^2 - 1$ observables. An interesting question is:
can this number of types of measurement be reduced if
we use generalised measurements instead? This issue was
explored by Peres and Terno [55], who came to the
intriguing conclusion that only a single generalised measurement
is necessary. Their strategy is essentially as
follows: consider a generalised measurement with $N^2 - 1$
detection operators $\Pi_k$. If we have a large number of
systems, all prepared in the same unknown state $\rho$, then
we can determine the probability of the $k$th outcome,
$\omega_k$, which, by Eq. (2.11), is given by $P(\omega_k|\rho) = \mathrm{Tr}\,\rho\Pi_k$.
What Peres and Terno realised was that there exist generalised
measurements for which the $N^2 - 1$ probabilities
$P(\omega_k|\rho)$ are one possible set of parameters which can be
used to infer the density operator itself.

From the above discussion, it is apparent that if Bob 
can copy the state he has, then he can determine it, and 
vice versa. Unambiguous state discrimination is possi- 
ble only if the state belongs to a known, linearly inde- 
pendent set. On the basis of the above argument, we 
should expect that the same constraint limits the abili- 
ties of cloning machines. This is indeed the case. It was 
recently discovered by Duan and Guo [56] that only linearly
independent states can be cloned, and only with
unit probability if they are orthogonal. 



B. Exact cloning and unambiguous state 
discrimination 

An intriguing question is the following: given this sym- 
biotic relationship between discrimination and cloning, 
the essence of which is that the conditions under which 
one of these operations is possible also apply to the other, 
are there quantitative relationships between their opti- 
mal figures of merit? On the basis of the relationship 
between unambiguous discrimination and exact cloning, 
we would expect their maximum success probabilities to 
be related. 

We will see how to obtain a bound on the maximum 
probability of cloning two equally probable states using 
the bound on the probability of discriminating between 
two states, given by the Ivanovic-Dieks-Peres limit in Eq. 
(3.5). We then show that this cloning bound also leads to 
the IDP bound. Finally, we will show that both bounds 
are special cases of a more general limit which relates to 
an operation known as quantum state separation. 

For two equally-probable pure states $|\psi_\pm\rangle$, the maximum
probability of success for unambiguous discrimination
is given by $1 - |\langle\psi_+|\psi_-\rangle|$, from the IDP limit in Eq.
(3.5). It follows from this that if Bob has $M$ copies of the
system, then the maximum probability of discriminating
between these $M$ copies $|\psi_\pm\rangle_1 \cdots |\psi_\pm\rangle_M$ is

$$P_{M\infty}(\mathrm{opt}) = 1 - |\langle\psi_+|\psi_-\rangle|^M. \tag{5.3}$$

The use of the notation $P_{M\infty}$ will become apparent
shortly. If this is the maximum probability with which
one can discriminate between these $M$-particle states,
then it is impossible to improve upon this bound by the
following procedure. We first attempt an $N$ from $M$
cloning operation, that is, to transform these $M$ copies,
together with $N - M$ particles in 'blank states', into $N$
copies, where $N > M$. If this succeeds, we then attempt
to discriminate between the $N$-particle products, which
cannot be accomplished with probability greater than

$$P_{N\infty}(\mathrm{opt}) = 1 - |\langle\psi_+|\psi_-\rangle|^N. \tag{5.4}$$

The cloning probability, which we shall write as $P_{MN}$,
must be constrained by the fact that this compound
operation cannot be accomplished with probability
greater than $P_{M\infty}(\mathrm{opt})$. If this were not true, then
$P_{M\infty}(\mathrm{opt})$ could not be the maximum probability of
distinguishing between $M$ copies of $|\psi_+\rangle$ or $|\psi_-\rangle$. Thus,
$P_{M\infty}(\mathrm{opt}) \ge P_{MN} P_{N\infty}(\mathrm{opt})$. In fact, it was shown in [57]
that the equality here can be attained, implying that the
maximum cloning probability is

$$P_{MN}(\mathrm{opt}) = \frac{1 - |\langle\psi_+|\psi_-\rangle|^M}{1 - |\langle\psi_+|\psi_-\rangle|^N}. \tag{5.5}$$

This generalises an earlier result by Duan and Guo [56]
that the maximum probability of making two copies of
the state given one initially is

$$P_{12}(\mathrm{opt}) = \frac{1}{1 + |\langle\psi_+|\psi_-\rangle|}. \tag{5.6}$$

Exact cloning in this manner has not yet been realised in
the laboratory, although a quantum-computational network
which achieves this task has been proposed [59].
A further recent development in the study of probabilistic
cloning machines is that it is possible for the actual
number of copies to be a quantum, rather than a classical
variable. Pati [60] has shown how one can construct
a 'novel' cloning machine which, with some probability,
will generate a superposition of various numbers of exact
copies.
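
The consistency argument behind the cloning bound of Eq. (5.5) can be made concrete with a short numerical sketch. It assumes the multi-copy discrimination limit $1 - |\langle\psi_+|\psi_-\rangle|^M$ and the cloning bound given as the ratio of two such limits; the function names and the example overlap are hypothetical.

```python
def p_discriminate(overlap, copies):
    """IDP limit for `copies` identical copies: 1 - |<psi+|psi->|^copies."""
    return 1.0 - abs(overlap) ** copies

def p_clone(overlap, m, n):
    """Maximum probability of producing n exact copies from m (n > m)."""
    return p_discriminate(overlap, m) / p_discriminate(overlap, n)

s = 0.6                                              # example overlap
p12 = p_clone(s, 1, 2)                               # one-to-two cloning
compound = p_clone(s, 2, 5) * p_discriminate(s, 5)   # clone, then discriminate
direct = p_discriminate(s, 2)                        # discriminate directly
many = p_clone(s, 1, 200)                            # many copies from one
```

Optimal cloning followed by optimal discrimination succeeds exactly as often as direct discrimination, `p12` reproduces $1/(1 + |\langle\psi_+|\psi_-\rangle|)$, and `many` approaches $1 - |\langle\psi_+|\psi_-\rangle|$ from above.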

We see that the bound on the maximum probability
of unambiguous discrimination implies a corresponding
bound on the maximum probability of exact cloning. As
it happens, the bound on exact cloning $P_{MN}(\mathrm{opt})$ also
implies that $P_{1\infty}(\mathrm{opt})$ in Eq. (5.3) is actually the maximum
probability of unambiguously discriminating between
the states $|\psi_\pm\rangle$.

Given one initial copy of the state, the maximum
probability that we can make $N$ copies is given by
$P_{1N}(\mathrm{opt})$. We can see from Eq. (5.5) that as $N \to \infty$,
$P_{1N}(\mathrm{opt}) \to 1 - |\langle\psi_+|\psi_-\rangle|$ from above. In this limit, the
state could be inferred through the statistics of appropriate
measurements on the copies, so we have shown how
Eq. (5.5) implies that the states can be discriminated
unambiguously with probability $P_{1\infty}(\mathrm{opt})$. Consistency
with the cloning bound implies that no greater value than
$P_{1\infty}(\mathrm{opt})$ can be attained. If state discrimination could
be accomplished with probability higher than $P_{1\infty}(\mathrm{opt})$,
then we could, with the same probability, make an arbitrarily
large number of copies of the state given one
initial realisation. If this probability was greater than
$P_{1\infty}(\mathrm{opt})$, it would also exceed $P_{1N}(\mathrm{opt})$ for sufficiently large
$N$. Therefore, the discrimination bound can also be obtained
from the cloning bound.



C. Quantum state separation 

We will conclude this section with a brief discussion 
of a general quantum operation which has unambiguous 
discrimination and exact cloning as special cases. This 
operation is known as quantum state separation. 

Consider two non-orthogonal quantum states $|\phi_\pm\rangle$. We
would like to know the maximum probability with which
these can be transformed into another pair of quantum
states $|\phi'_\pm\rangle$ such that

$$|\langle\phi'_+|\phi'_-\rangle| \le |\langle\phi_+|\phi_-\rangle|. \tag{5.7}$$

The (modulus of the) overlap of the final states is less
than that of the initial states, hence the term state separation.
In [57], it is shown that if both states have equal
a priori probabilities, then the maximum value $P_S(\mathrm{opt})$
of this separation probability is

$$P_S(\mathrm{opt}) = \frac{1 - |\langle\phi_+|\phi_-\rangle|}{1 - |\langle\phi'_+|\phi'_-\rangle|}. \tag{5.8}$$

This operation corresponds to unambiguous discrimination
when the final states $|\phi'_\pm\rangle$ are orthogonal. In this
case, the denominator is equal to 1, and $P_S(\mathrm{opt})$ is equal
to the Ivanovic-Dieks-Peres limit on the probability of
conclusively distinguishing between the states $|\phi_\pm\rangle$.

Suppose instead that $|\phi_\pm\rangle$ represents $M$ copies of
the state $|\psi_\pm\rangle$, together with $N - M$ particles in some
collective 'blank' state $|\chi\rangle$. We then have $|\phi_\pm\rangle =
|\psi_\pm\rangle_1 \cdots |\psi_\pm\rangle_M |\chi\rangle$. We take the final states $|\phi'_\pm\rangle$ to be $N$
copies of the state $|\psi_\pm\rangle$, that is, $|\phi'_\pm\rangle = |\psi_\pm\rangle_1 \cdots |\psi_\pm\rangle_N$.
The modulus of the overlap between the final states,
$|\langle\phi'_+|\phi'_-\rangle| = |\langle\psi_+|\psi_-\rangle|^N$, is less than that of the
corresponding initial states, $|\langle\phi_+|\phi_-\rangle| = |\langle\psi_+|\psi_-\rangle|^M$.
Exact cloning is then a further example of state separation.
Substitution of these expressions into Eq. (5.8) gives the
maximum cloning probability $P_{MN}(\mathrm{opt})$ in Eq. (5.5).
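
The substitution can be written out explicitly. The following short derivation is a sketch which assumes that the separation bound of Eq. (5.8) is the ratio of one minus the initial overlap to one minus the final overlap. The shorthand $s = |\langle\psi_+|\psi_-\rangle|$ is introduced here for brevity and is not part of the original notation. The normalised blank state $|\chi\rangle$ contributes a factor of 1 to the initial overlap.

```latex
\begin{align*}
P_S(\mathrm{opt})
  &= \frac{1 - |\langle\phi_+|\phi_-\rangle|}{1 - |\langle\phi'_+|\phi'_-\rangle|}
   = \frac{1 - s^{M}}{1 - s^{N}} = P_{MN}(\mathrm{opt}), \\
P_{12}(\mathrm{opt})
  &= \frac{1 - s}{1 - s^{2}} = \frac{1}{1 + s}, \\
\lim_{N\to\infty} P_{MN}(\mathrm{opt})
  &= 1 - s^{M} = P_{M\infty}(\mathrm{opt}) \qquad (0 \le s < 1).
\end{align*}
```

The first line is the cloning bound, the second recovers the Duan-Guo result for one-to-two cloning, and the third shows that producing infinitely many copies is exactly as hard as unambiguous discrimination of the $M$ initial copies.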

In our discussion of state discrimination, we saw the 
importance of examining what happens to the state of 
the system when the operation fails. The erasure of in- 
formation that takes place there also occurs for the more 
general state separation operation, and therefore also for 
cloning. When an optimal state separating operation 
fails, the possible initial states $|\phi_\pm\rangle$ are transformed into
the same state, rendering any further separation attempt 
impossible. 



VI. UNIVERSAL STATE ESTIMATION AND 
CLONING 

A. Estimating a completely unknown state 

So far, we have been examining the problem of trying
to discriminate, as well as we can, between members of
a known, finite set of states. Suppose that the state is
completely unknown. In the two-level case, Alice might
give Bob a qubit prepared in the state

$$|\psi\rangle = a|+\rangle + b|-\rangle, \tag{6.1}$$

and Bob has no information at all about the values of the
coefficients $a$ and $b$ (except, of course, that $|a|^2 + |b|^2 = 1$,
for normalisation). The strategies we have examined for
finite sets are not useful here. The set of possible states
is clearly linearly dependent, so unambiguous discrimination
between them is impossible. More generally, any
realistic detection strategy will have a finite number of
outcomes. Since we are dealing here with an infinite set
of possible states, we cannot uniquely associate one outcome
with every possible state, even if we allow for errors.

Massar and Popescu [61] and Derka, Buzek and Ekert
[62] examined this problem from a different perspective,
proposing a more realistic strategy known as quantum
state estimation. The problem can be formulated as a
game. Alice gives Bob $M$ copies of the state $|\psi\rangle$, and
his task is to perform a measurement with $K$ outcomes.
On the basis of the outcome he records, he will conjecture
that the state was a member of some finite set of
states $|\omega_k\rangle$, where $k = 1, \ldots, K$. In general, his guess will
be wrong, and the idea is to construct the measurement
such that the conjectured state is, on average, as close as
possible to the actual state.

The accuracy of Bob's guess is measured by a score
function. Slightly different score functions are chosen by
the two sets of authors, although their end results are the
same. The experimental significance of their functions
can be appreciated if the qubits are realised as photon
polarisation states. Massar and Popescu take the score
function to be $\cos^2(\alpha/2)$, where $\alpha$ is the angle between
the actual and guessed directions of polarisation. Derka
et al use $\cos^2\alpha$. This latter choice has a significance for
general quantum systems. It is known as the fidelity.
If the actual state of the system is $|\psi\rangle$ and the guessed
state is $|\omega\rangle$, then the fidelity $F(\omega|\psi)$ is simply the
square-overlap between them,

$$F(\omega|\psi) = |\langle\omega|\psi\rangle|^2. \tag{6.2}$$

The fidelity has the following interpretation. Consider
a measurement designed to determine whether or not a
quantum system has been prepared in the state $|\omega\rangle$. The
best measurement is a so-called maximal measurement.
This is a von Neumann measurement of an observable
which has $|\omega\rangle$ among its eigenstates. If the initial
state of the system is $|\psi\rangle$, then the fidelity $F$ in Eq. (6.2)
is the probability that the result of this measurement is
'yes'. The fidelity is then a natural and practical measure
of how closely the states $|\omega\rangle$ and $|\psi\rangle$ resemble each other.

In the context of polarisation, the significance of the
fidelity can be appreciated by considering Malus' Law in
its photonic form. This tells us that the probability that
the actual photon state would pass through a polariser
designed to transmit photons in the guessed polarisation
state is $\cos^2\alpha$, i.e., it is equal to the fidelity.
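
The equivalence between the fidelity and Malus' Law is easy to check for linear polarisation states. The sketch below is illustrative: it assumes states written as amplitude pairs in a horizontal/vertical basis, and the function names and angles are hypothetical.

```python
import math

def linear_polarisation(angle):
    """Amplitudes of a linear polarisation state in the (H, V) basis."""
    return (math.cos(angle), math.sin(angle))

def fidelity(omega, psi):
    """F(omega|psi) = |<omega|psi>|^2 for states given as amplitude tuples."""
    amp = sum(complex(w).conjugate() * p for w, p in zip(omega, psi))
    return abs(amp) ** 2

alpha = math.radians(30)                        # angle between the two polarisations
actual = linear_polarisation(math.radians(10))
guess = linear_polarisation(math.radians(10) + alpha)
f = fidelity(guess, actual)
f_orthogonal = fidelity(linear_polarisation(0.0), linear_polarisation(math.pi / 2))
```

Here `f` equals $\cos^2\alpha$, the transmission probability through a polariser aligned with the guessed state, and orthogonal polarisations give a fidelity of zero.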

The half-angle formula of Massar and Popescu is also
useful for the following reason: if the fidelity score is used,
then when the real and guessed polarisations are orthogonal,
we would obtain a score of 0. However, this corresponds
to as much information gain as in cases when
the score reaches 1. This is because a fidelity of 0 is only
obtained when the guess state $|\omega\rangle$ and the actual state
$|\psi\rangle$ are orthogonal. A fidelity of 1 could then be obtained
by replacing each guess state by the state orthogonal to
it. However, this property presents no problems if we
are interested in maximising the score function, averaged
over all states, which we shall be. Nevertheless, Massar
and Popescu's half-angle score function avoids this ambiguity.
Its minimum value is 1/2, which corresponds to
the actual state having probability of 1/2 of being in either
the guessed state or the one orthogonal to it, which
means no information gain at all.

The best measurement for any score function is that
which maximises the average of the score over all states.
As it happens, the maxima of both score functions are
equal, although for the sake of concreteness, we shall
concentrate on the fidelity. The average fidelity, given $M$
initial copies of the unknown state $|\psi\rangle$, is

$$\bar{F}_M = \sum_{k=1}^{K} \int d\psi\, P(\omega_k|\psi)\, F(\omega_k|\psi), \tag{6.3}$$

where $P(\omega_k|\psi)$ is the probability that the guessed state
is $|\omega_k\rangle$ given that the actual state is $|\psi\rangle$. This function
resembles the Bayes' cost function in Eq. (2.4), although
the positive nature of a 'score' contrasts with the negative
nature of a cost, which implies that it is desirable
to maximise Eq. (6.3), whereas we would prefer to minimise
the Bayes' cost $C_B$ in Eq. (2.4). This distinction,
however, is quite superficial: it amounts merely to a difference
of sign. There are more significant differences
between these two figures of merit. One of the most
obvious differences between $C_B$ and $\bar{F}_M$ is that the latter
refers to a continuous set of states, while the former
refers to a discrete set. Also, in evaluating the Bayes'
cost, the possible states will, in general, have different a
priori probabilities. In quantum state estimation however,
we have no a priori information about the state of
the system, so the a priori probability density is uniform.
Perhaps the most pertinent difference between the two
strategies is that, in hypothesis testing, the number of
outcomes is fixed: it is equal to the number of possible
states. In contrast, there is no a priori information about
the guess states $|\omega_k\rangle$ in state estimation. All properties
of the guess states, including how many of them there
are, are to be determined through the optimisation procedure,
that is, the maximisation of $\bar{F}_M$.

Given $M$ initial copies of the state, Derka et al showed
that the maximum value of $\bar{F}_M$ is

$$\bar{F}_M(\mathrm{opt}) = \frac{M + 1}{M + 2}. \tag{6.4}$$

Massar and Popescu showed that this is also the maximum
of $\bar{F}_M$ if $F(\omega_k|\psi)$ is replaced by $\cos^2(\alpha/2)$. We see
that the maximum score increases with $M$ until, in the
limit as $M \to \infty$, it attains the value of 1.

Massar and Popescu proved the existence of a finite set 
of guess states which attains the optimum score. Derka et 
al provided an explicit algorithm for finding these states. 
Actually their algorithm gives the optimum measure- 
ment for any set of states generated from some reference 
state $|\psi_0\rangle$ by a unitary, finite representation of a compact
Lie group. The Massar-Popescu paper is concerned with 
the group SU(2), that is, the group which generates all 
possible pure states of a qubit from the reference state. 

If Alice gives to Bob $M$ copies of the state $|\psi\rangle$, then he
will obtain the guess state $|\omega_k\rangle$ with probability $P(\omega_k|\psi)$.
What state then does Bob guess 'on average'? It is simply
a mixture of the guess states weighted by their respective
probabilities, which we denote by $\rho_{\mathrm{guess}}$:

$$\rho_{\mathrm{guess}} = \sum_{k=1}^{K} P(\omega_k|\psi)\, |\omega_k\rangle\langle\omega_k|. \tag{6.5}$$

An appropriate measure of how closely this average guess
state $\rho_{\mathrm{guess}}$ resembles the actual state $|\psi\rangle$ is again given by
the fidelity:

$$F_M(\psi) = \langle\psi|\rho_{\mathrm{guess}}|\psi\rangle = \sum_{k=1}^{K} P(\omega_k|\psi)\, F(\omega_k|\psi). \tag{6.6}$$

The optimum mean fidelity $F_M(\mathrm{opt})$ in Eq. (6.4) is just
the average of the optimum $F_M(\psi)$ over all states $|\psi\rangle$.
The symmetry of the optimal measurement implies that
$F_M(\psi)$ must actually be independent of $|\psi\rangle$. This implies
that the average guess state $\rho_{\mathrm{guess}}$ must have the form

$$\rho_{\mathrm{guess}} = \frac{1}{2}(1 - S_M)\,\mathbb{1} + S_M\,|\psi\rangle\langle\psi|, \tag{6.7}$$

where $0 \le S_M \le 1$. The quantity $S_M$ is known as a shrinking
factor, for the following reason. Let us write the density
operator $|\psi\rangle\langle\psi|$ in the Bloch representation described
in the preceding section: that is, $|\psi\rangle\langle\psi| = (\mathbb{1} + \mathbf{a}\cdot\boldsymbol{\sigma})/2$,
for some unit vector $\mathbf{a}$. Then one can show using Eq.
(6.7) that $\rho_{\mathrm{guess}}$ has the Bloch representation

$$\rho_{\mathrm{guess}} = \frac{1}{2}(\mathbb{1} + S_M\,\mathbf{a}\cdot\boldsymbol{\sigma}). \tag{6.8}$$

The shrinking factor $S_M$ decreases the length of the Bloch
vector $\mathbf{a}$, although its direction is maintained. If $S_M = 1$,
then $\rho_{\mathrm{guess}}$ is equal to the original state $|\psi\rangle\langle\psi|$. On the
other hand, if $S_M = 0$, then $\rho_{\mathrm{guess}}$ is completely mixed,
and contains no information about the initial state.
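The step from Eq. (6.7) to Eq. (6.8) can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard parametrisation of a qubit state by polar angles (theta, phi). All function names, and the sample values of theta, phi and s, are illustrative and not taken from the text.

```python
import cmath
import math


def pure_state(theta, phi):
    """Amplitudes of cos(theta/2)|0> + exp(i phi) sin(theta/2)|1>."""
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))


def projector(psi):
    """2x2 density matrix |psi><psi| as nested lists."""
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]


def shrunk_state(psi, s):
    """Eq. (6.7): (1 - s)/2 times the identity plus s |psi><psi|."""
    proj = projector(psi)
    return [[(1 - s) / 2 * (i == j) + s * proj[i][j] for j in range(2)]
            for i in range(2)]


def bloch_vector(rho):
    """Bloch vector r of a qubit density matrix rho = (1 + r.sigma)/2."""
    return (2 * rho[1][0].real,
            2 * rho[1][0].imag,
            (rho[0][0] - rho[1][1]).real)


def fidelity(psi, rho):
    """<psi|rho|psi> for a pure state psi."""
    return sum(psi[i].conjugate() * rho[i][j] * psi[j]
               for i in range(2) for j in range(2)).real


theta, phi, s = 1.1, 0.7, 1 / 3   # arbitrary sample values
psi = pure_state(theta, phi)
a = bloch_vector(projector(psi))
r = bloch_vector(shrunk_state(psi, s))

print("length of a      :", math.sqrt(sum(x * x for x in a)))
print("component ratios :", [ri / ai for ri, ai in zip(r, a)])
print("fidelity         :", fidelity(psi, shrunk_state(psi, s)))
print("(1 + s)/2        :", (1 + s) / 2)
```

Each component of the Bloch vector is scaled by the same factor s, so the direction is preserved while the length shrinks, and the fidelity with the original state comes out as (1 + s)/2.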

The optimal state estimation strategy maximises the
average fidelity $F_M$. It also maximises the shrinking
factor $S_M$. This can be seen in the following way.
The shrinking factor is independent of the actual state $|\psi\rangle$.
Therefore, $F_M(\psi)$ in Eq. (6.6) is simply equal to
$(1 + S_M)/2$. Substituting this into Eq. (6.3) shows that
$F_M$ must have the form constant × $(1 + S_M)/2$. However,
this constant is unity, due to normalisation of the
integral. Thus, we have



$$F_M(\mathrm{opt}) = \frac{1}{2}\bigl(1 + S_M(\mathrm{opt})\bigr), \tag{6.9}$$

leading to

$$S_M(\mathrm{opt}) = \frac{M}{M+2}. \tag{6.10}$$



If only one copy of the state is initially available, then
the maximum value of the shrinking factor is 1/3, and
the Bloch vector is reduced to 1/3 of its former length.
However, the shrinking factor grows with increasing M
until, in the limit as $M \to \infty$, the shrinking factor tends
to unity.
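The arithmetic linking Eqs. (6.4), (6.9) and (6.10) can be tabulated exactly. The sketch below uses Python's `fractions` module; the function names are illustrative.

```python
from fractions import Fraction


def optimal_fidelity(m):
    """Eq. (6.4): F_M(opt) = (M + 1)/(M + 2)."""
    return Fraction(m + 1, m + 2)


def optimal_shrinking_factor(m):
    """Invert Eq. (6.9), F = (1 + S)/2, to obtain S = 2F - 1."""
    return 2 * optimal_fidelity(m) - 1


for m in (1, 2, 5, 10, 100):
    s = optimal_shrinking_factor(m)
    # Eq. (6.10): the result should equal M/(M + 2).
    assert s == Fraction(m, m + 2)
    print(f"M = {m:3d}: F_M(opt) = {optimal_fidelity(m)}, S_M(opt) = {s}")
```

For M = 1 this gives F = 2/3 and S = 1/3, and both quantities approach 1 as M grows.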



B. Universal cloning machines 

As with unambiguous discrimination, the optimal fig- 
ure of merit in Eq. (6.4) for universal state estimation is 
intimately related to the optimal efficiency of cloning. In 
this case, it is the optimal efficiency of universal cloning. 
The idea of a universal quantum cloning machine (UCM) 
was conceived by Hillery and Buzek [63].

The idea is essentially this: Alice gives Bob M copies
of a quantum system prepared in some state $|\psi\rangle$. All
states are equally probable. Now, we know that only
linearly independent states can be cloned exactly, so the
copies produced by such a cloning machine will necessarily
be imperfect. The degree of imperfection of the clones
is most easily expressed using the Bloch representation.
Universal cloning machines are designed to copy all states
equally well, and are thus of a highly symmetrical nature.
If all of the clone states are required to be identical, one
of the consequences is that the direction of the Bloch
vector is identical to that of the original state. However, its
length decreases by a shrinking factor $S_{MN}$ which is
independent of the state cloned and depends only upon M,
the number of initial, exact copies, and N, the number of
final approximate copies. If the initial copies are of the
form shown in Eq. (5.2), then the output ones look like



$$\rho_{\mathrm{out}} = \frac{1}{2}(\mathbb{1} + S_{MN}\,\mathbf{b}\cdot\boldsymbol{\sigma}). \tag{6.11}$$



Optimising a universal cloning machine means minimising
the decrease in the length of the Bloch vector. Bruss
et al [64] showed that the optimum, that is, the maximum
value of this shrinking factor is

$$S_{MN}(\mathrm{opt}) = \frac{M(N+2)}{N(M+2)}. \tag{6.12}$$



As in universal state estimation, the performance of a
UCM can be characterised by either the shrinking factor
or the fidelity between the actual state and the state
obtained. We shall denote by $F_{MN}$ the fidelity between the
output state of a UCM and the exact state, given that
M copies of $|\psi\rangle$ were supplied to the UCM, which then
produced N > M imperfect copies. As a consequence of
symmetry, $F_{MN}$, like the shrinking factor, is independent
of $|\psi\rangle$. From the definition $F_{MN} = \langle\psi|\rho_{\mathrm{out}}|\psi\rangle$, we easily
find that $F_{MN} = (1 + S_{MN})/2$, leading to

$$F_{MN}(\mathrm{opt}) = \frac{M + N + MN}{N(M+2)}. \tag{6.13}$$
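As a quick consistency check between Eqs. (6.12) and (6.13), the following sketch evaluates both expressions exactly and confirms the relation F = (1 + S)/2 for a few values of M and N. The function names are illustrative, and the trivial case N = M is allowed here purely as a sanity check.

```python
from fractions import Fraction


def cloning_shrinking_factor(m, n):
    """Eq. (6.12): S_MN(opt) = M(N + 2) / (N(M + 2))."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= M <= N")
    return Fraction(m * (n + 2), n * (m + 2))


def cloning_fidelity(m, n):
    """Eq. (6.13): F_MN(opt) = (M + N + MN) / (N(M + 2))."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= M <= N")
    return Fraction(m + n + m * n, n * (m + 2))


for m, n in [(1, 2), (1, 3), (2, 3), (2, 10), (5, 50)]:
    s = cloning_shrinking_factor(m, n)
    f = cloning_fidelity(m, n)
    # The two formulas must agree with F = (1 + S)/2.
    assert f == (1 + s) / 2
    print(f"M = {m}, N = {n:2d}: S_MN(opt) = {s}, F_MN(opt) = {f}")
```

For 1→2 cloning this reproduces the familiar values S = 2/3 and F = 5/6.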



This result had been obtained previously by Gisin and
Massar [65] for $1 \to N$ cloning. Although they proved it
to be optimal for $N \le 7$, they conjectured it to be optimal
for all N. This conjecture was confirmed by Bruss,
Ekert and Macchiavello [36]. These authors also showed
that the optimum figures of merit for universal cloning
are intimately related to the maximum fidelity obtained
in universal state estimation. The relationship between
universal state estimation and cloning is most easily
expressed in terms of their respective shrinking factors $S_M$
and $S_{MN}$. As is the case with unambiguous discrimination
and exact cloning, the bounds on the optimal figures
of merit for both operations imply each other.



C. Relationship between state estimation and 
universal cloning 

In the remainder of this section, we will present the
arguments of Bruss et al which show that, given the
optimal shrinking factor $S_{MN}(\mathrm{opt})$ for cloning, we can
directly deduce the optimum shrinking factor, $S_M(\mathrm{opt})$, for
state estimation, and vice versa. To deduce the optimum
shrinking factor for state estimation from that for cloning,
suppose that an optimum state estimation measurement
is carried out on M copies of an unknown state $|\psi\rangle$. We
can use this state estimation procedure to make approximate
copies of the state in the following way: every time
the guess state $|\omega_k\rangle$ is obtained, we make N − M further
copies of this guess state, for some N > M. The shrinking
factor for the average guess state of each of these
copies is identical to that for optimum state estimation.
However, it cannot exceed that for an optimum UCM:
it can at most equal it, in which case the procedure we
describe would actually be an optimum UCM. This leads
to the inequality



$$S_{MN}(\mathrm{opt}) \ge S_M(\mathrm{opt}), \tag{6.14}$$
for all N > M. On the other hand, suppose that, given
M initial copies of the state $|\psi\rangle$, we send these states to
an optimum $M \to N$ UCM. The N approximate copies of
the state $|\psi\rangle$ are then subjected to an optimal state
estimation measurement. The concatenation of an $M \to N$
UCM and a state estimation measurement on the N
approximate copies cannot lead to a higher shrinking factor
than a state estimation measurement on the M original
copies (for much the same reason that a probabilistic
cloning machine cannot be used to increase the probability
of unambiguous discrimination, as we saw in the
preceding section). Bruss et al showed that this leads to
the inequality



$$S_{M\infty}(\mathrm{opt}) \le S_M(\mathrm{opt}). \tag{6.15}$$



Combining inequalities (6.14) and (6.15), we see that the 
optimal shrinking factors for state estimation and infinite 
universal cloning are equal: 



$$S_{M\infty}(\mathrm{opt}) = S_M(\mathrm{opt}). \tag{6.16}$$



Thus the optimum shrinking factor, and hence the fidelity, of
state estimation can be deduced from that of universal
cloning.
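The equality in Eq. (6.16) can be illustrated using the explicit expressions in Eqs. (6.10) and (6.12): the cloning shrinking factor always lies above the estimation value, as required by (6.14), and the gap closes as N grows. The sketch below is only a numerical illustration of this limit, with illustrative function names; it is not a substitute for the argument in the text.

```python
from fractions import Fraction


def estimation_shrinking_factor(m):
    """Eq. (6.10): S_M(opt) = M/(M + 2)."""
    return Fraction(m, m + 2)


def cloning_shrinking_factor(m, n):
    """Eq. (6.12): S_MN(opt) = M(N + 2) / (N(M + 2))."""
    return Fraction(m * (n + 2), n * (m + 2))


m = 3
for n in (4, 10, 100, 10_000, 1_000_000):
    gap = cloning_shrinking_factor(m, n) - estimation_shrinking_factor(m)
    # Inequality (6.14): the gap is positive for every finite N > M ...
    assert gap > 0
    # ... and equals 2M / (N(M + 2)), which vanishes as N -> infinity.
    assert gap == Fraction(2 * m, n * (m + 2))
    print(f"N = {n:>9,d}: S_MN(opt) - S_M(opt) = {float(gap):.3e}")
```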

Let us now see how the optimal shrinking factor for
cloning can be deduced from the optimum state estimation
shrinking factor. This follows from a natural property
of an optimal UCM which we will describe first.
Suppose that we initially have M copies of the state
$|\psi\rangle$. These are fed into an $M \to N$ UCM, and the
corresponding shrinking factor is $S_{MN}$. We then feed these
N approximate copies to a further $N \to L$ UCM, for some
L > N. This will shrink the Bloch vector further, by the
shrinking factor $S_{NL}$. Consider now an optimal $M \to L$
UCM. If this has the shrinking factor $S_{ML}(\mathrm{opt})$, then
clearly we must have



$$S_{MN}\,S_{NL} \le S_{ML}(\mathrm{opt}). \tag{6.17}$$



This holds for all L, in particular, in the limit as $L \to \infty$.
It also holds whether or not the $N \to L$ UCM is optimum,
so

$$S_{MN} \le \frac{S_{M\infty}(\mathrm{opt})}{S_{N\infty}(\mathrm{opt})}. \tag{6.18}$$



To obtain an expression for the optimal shrinking factor
$S_{MN}(\mathrm{opt})$, we make use of two results. The first
is the fact that, for concatenated optimal UCMs of the
kind we have been describing, the shrinking factors
multiply: that is, inequality (6.17) becomes an equality when
$S_{MN} = S_{MN}(\mathrm{opt})$. The second is Eq. (6.16), relating the
shrinking factors for optimal infinite cloning and state
estimation. These results imply that

$$S_{MN}(\mathrm{opt}) = \frac{S_M(\mathrm{opt})}{S_N(\mathrm{opt})}. \tag{6.19}$$



Notice the formal resemblance between this relationship
and that between the maximum probabilities of unambiguous
discrimination and exact cloning in Eq. (5.5).
In fact, both arguments parallel one another, suggesting
that a deeper and more general connection between
determining the state of a quantum system, and copying
it, could be found.
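The two ingredients of this argument, the multiplication of shrinking factors for concatenated optimal UCMs and the ratio formula (6.19), can be verified directly from the closed forms in Eqs. (6.10) and (6.12). A minimal sketch with illustrative function names:

```python
from fractions import Fraction


def estimation_shrinking_factor(m):
    """Eq. (6.10): S_M(opt) = M/(M + 2)."""
    return Fraction(m, m + 2)


def cloning_shrinking_factor(m, n):
    """Eq. (6.12): S_MN(opt) = M(N + 2) / (N(M + 2))."""
    return Fraction(m * (n + 2), n * (m + 2))


for m, n, l in [(1, 2, 3), (1, 2, 10), (2, 5, 7), (3, 4, 100)]:
    s_mn = cloning_shrinking_factor(m, n)
    s_nl = cloning_shrinking_factor(n, l)
    s_ml = cloning_shrinking_factor(m, l)
    # Concatenated optimal UCMs: inequality (6.17) holds with equality.
    assert s_mn * s_nl == s_ml
    # Eq. (6.19): cloning factor as a ratio of estimation factors.
    assert s_mn == estimation_shrinking_factor(m) / estimation_shrinking_factor(n)
    print(f"M = {m}, N = {n}, L = {l:3d}: S_MN * S_NL = {s_mn * s_nl} = S_ML")
```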

We conclude this final section by mentioning some
interesting subsequent developments related to universal
state estimation and cloning. The results we have
discussed in this section refer to an unknown state of a
2-level system. It is natural to enquire as to how these
results can be generalised to the case of multilevel systems.
The generalisation of universal cloning machines to
multilevel systems has been fully worked out by Werner and
Keyl [67, 68], who gave an elegant mathematical
characterisation of the optimal such transformations.

The relationship between universal cloning and state
estimation for multilevel quantum systems has also been
explored by Bruss and Macchiavello [69]. For general
systems, this relationship is exactly as was shown for the
2-level case, and the same arguments can be used to
deduce the optimal figures of merit.

Although we have examined the 2-level case in some
detail, the results we have described by no means tell
the whole story, even about this simple case. We were
interested in estimating, or copying, a state $|\psi\rangle$, given M
initial copies. Suppose that the physical systems which
were prepared in this state were spin-1/2 particles. The
classical analogue of such a system would be M particles
whose spins point in the same direction. We would thus
expect that, if some of the spins were anti-parallel to
the others, the information content would be the same,
since they define the same spin axis. Gisin and Popescu
[70] investigated this from a quantum mechanical point
of view, and found that this is not the case, namely, that
anti-parallel spins contain more information.

One of the most exciting recent developments in the 
field has been the announcement, just a few days before 
completion of this article, that universal cloning of pho- 
ton polarisation has been carried out in the laboratory 
by Li et al [71]. This experiment, together with contemporaneous
demonstrations of optimal quantum state
discrimination ]2l],^8| , provide further encouragement to 
explore the ultimate physical limits of information pro- 
cessing and transmission. 



VII. DISCUSSION 

Although the state of a quantum system is not itself an 
observable, we have seen how novel measurement strate- 
gies, which are completely consistent with the formalism 
of quantum mechanics, enable one to obtain information 
about it. 

Often these measurements are different from the stan- 
dard von Neumann type discussed in introductory quan- 
tum mechanics courses. They may instead be generalised 
measurements. These typically involve the interaction of 
the system with another ancillary system, after which a 
von Neumann measurement is carried out on the latter.
These more general measurement strategies offer greater 
flexibility and scope than operations performed upon the 
system of interest alone. One of the main advantages 
conferred by generalised measurements to the study of 
state discrimination is the fact that the number of out- 
comes is not limited to the dimension of the system's 
state space. In quantum hypothesis testing, this allows 
us to discriminate between an arbitrarily large number 
of states with some probability of obtaining a correct 
result. An additional outcome can correspond to incon- 
clusive results, which allows linearly independent states 
to be unambiguously discriminated. 

We also saw how state discrimination is related to 
other operations on quantum systems, such as cloning 
and the manipulation of quantum entanglement. The 
field of quantum information has enjoyed rapid growth 
over the past few years, and some of the most intriguing 
discoveries made about the information-theoretic proper- 
ties of quantum systems have been unifications of seem- 
ingly distinct concepts such as those we have discussed 
here. 